Abstract


This paper addresses the identification of insurance models with multidimensional screening where insurees have private information about their risk and risk aversion. The model includes a random damage and the possibility of several claims. Screening of insurees relies on their certainty equivalence. The paper then investigates how data availability on the number of offered coverages and reported claims affects the identification of the model primitives under four different scenarios. We show that the model structure is identified despite bunching due to multidimensional screening and/or a finite number of offered coverages. The observed number of claims plays a key role in the identification of the joint distribution of risk and risk aversion. In addition, the paper derives all the restrictions imposed by the model on observables. Our results are constructive with explicit equations for estimation and model testing.

Keywords: Insurance, Identification, Adverse Selection, Multidimensional Screening. 



Identification of Insurance Models with 
Multidimensional Screening 

G. Aryal, I. Perrigne & Q. Vuong 

1 Introduction 

Insurance has long been studied in economics and is at the core of recent empirical research. Seminal papers by Rothschild and Stiglitz (1976) and Stiglitz (1977) have provided benchmark models of insurance under private information on insurees’ risk. In empirical studies, testing adverse selection in risk has generated a large number of papers with mixed results. See Chiappori and Salanié (2000) for the best-known test and Cohen and Siegelman (2010) for a survey of empirical findings. The recent empirical literature shows that adverse selection not only involves heterogeneity in risk but also in risk aversion, which is also called advantageous selection. See e.g. Finkelstein and McGarry (2006) in long-term care insurance, Cohen and Einav (2007) in automobile insurance, Fang, Keane and Silverman (2008) in health insurance, and Einav, Finkelstein and Schrimpf (2010) in the annuity market. See also Cutler, Finkelstein and McGarry (2008) and Einav and Finkelstein (2011) for surveys. As noted in these papers, heterogeneity in risk aversion may contradict the prediction of the benchmark adverse selection models, i.e., a low risk individual may buy a higher coverage because of high risk aversion and conversely. Thus, a model of insurance also needs to incorporate incomplete information in risk aversion, leading to multidimensional screening. This is known to be a difficult theoretical problem because of the violation of the Spence-Mirrlees (single-crossing) condition. See Rochet and Stole (2003) for a survey on multidimensional screening.





In this paper, we propose a model of insurance that includes private information in 
both risk and risk aversion as well as random damages and the possibility of several claims 
while endogenizing the contract terms. Following Landsberger and Meilijson (1999), we 
consider the certainty equivalence of no insurance as a one-dimensional representation of 
insurees’ types as this representation preserves the order of insurees after buying insurance. 
For convenience, we assume a constant absolute risk aversion and a nonparametric mixture 
of a Poisson distribution for the number of potential claims as they lead to a tractable 
form for the certainty equivalence. In the spirit of the theoretical literature, we consider 
automobile insurance with coverages of the form premium and deductible. Our model 
contains the key ingredients of insurance and can be extended to other insurance markets 
such as health by adding (say) a copayment. Thus, the model structure is defined by the 
joint distribution of risk and risk aversion and the distribution of damages. Within this 
model, we study the identification of the primitives. Identification is a key step for the 
econometric and empirical analysis of structural models. 

Starting with Koopmans (1949) and Hurwicz (1950), the problem of identification has a long history. As discussed by Heckman (2001), the labor literature provides several examples of the role played by identification in empirical studies. Over the past fifteen years, it has received much attention with the development of structural models in empirical industrial organization. See Athey and Haile (2007) for a survey on the identification of auction models.¹ The problem of (nonparametric) identification is important for several reasons. First, it allows one to assess the conditions required (if any) to recover uniquely the model structure from the observables while minimizing parametric assumptions. Second, it highlights which variations in the data allow one to identify each model primitive. Third, some important questions related to the structural analysis of models can be addressed once identification is established. One can think of which distribution of the data can be rationalized by the model, or what restrictions the model imposes on the observables that can be used to test the model validity.

Several lessons can be drawn from the recent literature on the identification of models with incomplete information. First, the optimal behavior of economic agents plays an important role. For instance, in contract models, the optimality of the offered payment is useful in addition to the optimal agents’ behavior. See Perrigne and Vuong (2011) for a procurement model with adverse selection and moral hazard and Luo, Perrigne and Vuong (2015) for nonlinear pricing. Thus, in most cases we need to consider both sides of the market, i.e. the principal and the agent(s), and assume that the observations are the equilibrium outcomes. Second, one achieves identification with standard identifying strategies such as instrumental variables and exclusion restrictions, which have been widely used in the early literature on identification. See Guerre, Perrigne and Vuong (2009) for the identification of risk aversion in auctions and Berry and Haile (2014) for a recent contribution to the identification of multinomial choice demand models. Third, the one-to-one equilibrium mapping between the unobserved agent’s private information and the observed outcome is a key element on which identification relies. See e.g. Guerre, Perrigne and Vuong (2000) and Athey and Haile (2007) in the context of auctions.

1 See also Matzkin (1994, 2007) for the nonparametric identification of models with nonseparable errors.

Our paper differs from this literature in several dimensions. First, we consider a model with multidimensional screening in which bunching/pooling cannot be avoided. In this case, identification cannot rely exclusively on the one-to-one mapping between the agent’s unobserved types and his observed outcome/action.² Second, our model also considers the possibility of a finite number of options/contracts offered to each agent, while agents’ types are distributed over a continuum. In addition to the bunching arising from multidimensional screening, additional bunching arises because of a finite number of contracts. This represents an additional challenge in the study of identification.³

2 Relying on Rochet and Choné (1998), Pioner (2007) addresses the semiparametric identification of bidimensional screening models in a nonlinear pricing context but assumes that one of the agent’s two types is observed by the analyst. Aryal (2015) considers nonparametric identification with multidimensional types. See also Luo, Perrigne and Vuong (2012, 2013), who study identification of nonlinear pricing models with multiple types relying on the Armstrong (1996) model. The latter papers use optimality of both the principal and the agent as well as observations from multiple markets to identify the model primitives.

3 Crawford and Shum (2007) consider two contracts while agents’ types can take only two values, thereby avoiding any bunching. Gayle and Miller (2015) adopt a similar strategy. Leslie (2004) entertains a finite number of price options through a discrete choice model to analyze consumers’ behavior, taking the price schedule as exogenous.

To study identification of the model primitives and assess how data availability affects identification, we proceed as follows. We consider several data scenarios depending on the number of offered coverages and reported claims, namely whether the number of coverages is a continuum or finite and whether the data contain all claims or only those above the deductible. This strategy allows us to assess how data constrain or limit the identification of primitives, and which identifying assumptions are needed. Moreover, studying the identification under a continuum of coverages is important as a negative identification result would imply nonidentification of the model primitives under a finite number of coverages. A first data scenario exploits the one-to-one mapping between the level of certainty equivalence and the deductible to identify the distribution of certainty equivalence. The number of claims then plays a crucial role in identifying the joint distribution of risk and risk aversion. A second data scenario maintains a continuum of contracts but considers a damage distribution truncated at the deductible. Because a continuum of contracts is offered, the subpopulation choosing full insurance, i.e., a zero deductible, identifies the damage distribution and the argument of the first case applies.

When considering a finite number of contracts in the third and the fourth scenarios, identification becomes more challenging as we cannot exploit a one-to-one mapping between (say) the deductible and the insuree’s private information. Though the context is different, the number of claims continues to play a key role in identifying the marginal distribution of risk. Regarding the identification of the joint distribution of risk and risk aversion, we exploit an exclusion restriction and a full support assumption requiring sufficient variations in some exogenous characteristics. Under these assumptions, the model structure is identified when the damage distribution is fully observed. On the other hand, when the damage distribution is truncated at the deductible, we obtain identification of the structure up to the knowledge of the probability that the damage is below the deductible. The latter probability is not identified. We then discuss some identifying assumptions for the probability of damage below the deductible. A notable feature of our results under a finite number of contracts is that they do not rely on the optimality of the offered coverages. Consequently, our results apply to any form of competition in the insurance industry. To complete these results, we derive all the model restrictions on the observables in the fourth data scenario. These restrictions can be used to test the validity of the model and its assumptions. For instance, a model restriction allows a test of optimality of the offered coverages. This contrasts with the previous literature as discussed above, and our results represent a novel perspective on the identification of models under incomplete information. In addition, all our results are constructive and provide explicit equations for estimation and testing.

The paper is organized as follows. Section 2 presents the model. Sections 3 and 4 study 
identification under a continuum of contracts and under a finite number of contracts, 
respectively. Section 5 discusses some identifying strategies for the damage probability 
below the deductible and derives all the restrictions imposed by the model on observables. 
Section 6 concludes with future lines of research. An appendix collects the proofs. 

2 A Model of Insurance 

This section develops a model in which insurees have private information about their risk 
and risk aversion. The presence of multiple private information leads to multidimensional 
screening with pooling at equilibrium. See Rochet and Stole (2003) for a survey. Following 
Landsberger and Meilijson (1999), we use the concept of certainty equivalence to rank and 
screen insurees. To fix ideas and in the spirit of the early literature, we consider automobile 
insurance as an example throughout the paper though our framework also applies to (say) 
homeowner and rental insurance. See the end of this section for a discussion of health 
insurance. 

The Benchmark Model by Stiglitz 

This section briefly reviews the Stiglitz (1977) model and motivates our model that incorporates heterogeneous preferences and a random damage/expense. It also introduces basic notations. Insurees are characterized by a probability of accident (risk) $\theta \in [\underline{\theta}, \overline{\theta}]$ distributed as $F(\cdot)$ with a density $f(\cdot)$. An accident involves a fixed damage $D$ affecting the insuree’s wealth $w$. Because agents are risk averse, they buy insurance by paying a premium $t$. The insurance company requires a deductible $dd$ for each accident. Upon buying insurance, the agent’s wealth is $w - t$ in the event of no accident with probability $1 - \theta$ and $w - t - D + (D - dd) = w - t - dd$ in the event of an accident with probability $\theta$. His expected utility is then $V(t, dd; \theta) = (1 - \theta)U(w - t) + \theta U(w - t - dd)$, where $U(\cdot)$ is a von Neumann-Morgenstern utility function, which is continuous, strictly increasing and concave. The risk $\theta$ is private information while $U(\cdot)$ is known by the insurer.

In an incomplete information setting, the insurance company offers contracts of the form $[t(\theta), dd(\theta)]$ that are incentive compatible. The firm’s profit from a $\theta$-insuree is $\pi(\theta) = (1 - \theta)t(\theta) + \theta[t(\theta) - D + dd(\theta)] = t(\theta) - \theta(D - dd(\theta))$. Because risk is unknown, the insurance company maximizes its expected profit subject to the insuree’s incentive compatibility (IC) and participation (IR) constraints, namely

$$\max_{t(\cdot),\,dd(\cdot)} \int_{\underline{\theta}}^{\overline{\theta}} \pi(\theta) f(\theta)\, d\theta$$

s.t.

$$(1 - \theta)\,U'(w - t(\theta))\,t'(\theta) + \theta\,U'(w - t(\theta) - dd(\theta))\,[t'(\theta) + dd'(\theta)] = 0 \qquad \text{(IC)}$$

$$V(t(\theta), dd(\theta); \theta) \geq (1 - \theta)U(w) + \theta U(w - D) \qquad \text{(IR)},$$

where the RHS of (IR) expresses the agent’s expected utility with no insurance.⁴

The main findings of this model are as follows. First, pooling is not optimal and the 
firm benefits from offering a continuum of contracts. The individual with the highest risk, 
i.e., $\overline{\theta}$, is offered full insurance with a zero deductible. Second, premium and deductible
are inversely related. In addition, the premium is a convex function of the deductible 
implying a larger marginal price for lower deductibles. Third, the optimal coverage may 
entail some optimal exclusion for insurees with a low probability of accident. 

Though insurance contracts can include several features such as copayments and hard limits, it is worth noting that Arrow (1963) shows the optimality of the premium-deductible contract. Intuitively, the latter allows the best risk-sharing between a risk neutral insurer and a risk averse insuree as it is the best compromise between the willingness to reduce risk and the need to limit the insurance deadweight cost. Furthermore, Gollier and Schlesinger (1993) show that any other form of insurance contract is dominated by a contract with a deductible and a premium, implying that the deductible-premium coverage maximizes the insurer’s profit over all other possible forms of implementable contracts.

4 Because $U(w) - U(w - D) > 0$, there are no countervailing incentives as defined by Lewis and Sappington (1989).

The above model assumes at most one accident with a fixed damage and the same 
(known) risk aversion across insurees. In reality, there might be more than one accident 
over the policy period and every accident involves a random damage. Moreover, as shown 
by Finkelstein and McGarry (2006) and Cohen and Einav (2007), the variability in risk 
aversion might be more important than the variability in risk across insurees. It is also 
natural to consider that insuree’s risk aversion is as private as his probability of accident. 
Consequently, asymmetric information becomes bidimensional. Ignoring heterogeneity in 
risk aversion may have serious consequences on insurance policy design. For instance, an 
insuree with a low probability of accident and a high risk aversion may buy a contract 
with a high level of coverage (or low deductible) and conversely. This is also known as 
advantageous selection in the insurance literature. In contrast, when heterogeneity in risk 
aversion is ignored as in the above model, this insuree should buy a low level of coverage. 
In addition, the distribution of damages as well as the expected number of accidents have 
an important impact on the choice of deductible relative to the premium offered by the 
insurer. In view of this discussion, our model includes multiple accidents with random 
damage and heterogeneity in privately known risk aversion. In view of data availability, 
our model also considers a finite number of offered contracts/coverages. 

Model Assumptions 

We make the following assumptions. In our model, $\theta$ is the insuree’s risk measured as the expected number of accidents over the period of coverage.

Assumption A1:

(i) The insuree’s utility function exhibits Constant Absolute Risk Aversion (CARA), i.e., $U(x; a) = -\exp(-ax)$, $a > 0$,

(ii) The pairs $(\theta, a)$ are i.i.d. as $F(\cdot, \cdot)$, which is twice continuously differentiable on its support $\Theta \times \mathcal{A} = [\underline{\theta}, \overline{\theta}] \times [\underline{a}, \overline{a}] \subset \mathbb{R}_{++} \times \mathbb{R}_{++}$,

(iii) Each insuree may be involved in $J$ accidents, which, conditional on $\theta$, follows a Poisson distribution, i.e., $p_j(\theta) = \Pr[J = j \mid \theta] = e^{-\theta}\theta^j / j!$,

(iv) $J$ is independent across insurees and each accident involves a damage $D_j$, $j = 1, \ldots, J$. The damages are i.i.d. as $H(\cdot)$ on support $[0, \overline{d}] \subset \mathbb{R}_{+}$,

(v) $D_j$, $j = 1, \ldots, J$, are independent of $(\theta, a)$.

By A1-(i), the utility function is strictly increasing and concave. The CARA specification has two main advantages: (i) it leads to a tractable expression for the certainty equivalence and (ii) the attitude toward risk in changes in wealth is independent of initial wealth. These properties have made the CARA utility a popular choice in the theoretical and empirical literature. By A1-(ii), each insuree is characterized by a pair $(\theta, a)$ which is private information. Assumption A1-(iii) specifies the distribution of accidents as Poisson with mean $\theta$. This distribution is widely used in actuarial science to model the number of accidents. The combination of the CARA utility and the Poisson distribution is especially convenient as it leads to explicit expressions for the certainty equivalence defined later. Since $\theta$ is random by A1-(ii), and its marginal distribution is left unspecified, the distribution of the number of accidents in the population is a nonparametric mixture of Poisson distributions, thereby adding flexibility.⁵ Relaxing the CARA and/or Poisson specifications is possible at the cost of obtaining implicit expressions for the certainty equivalence. Our identification results of Sections 3 and 4 would still hold provided the distribution of the number of accidents belongs to the class of distributions whose nonparametric mixture is identified. See Rao (1992). By A1-(iv,v), damages are random, mutually independent and independent of types $(\theta, a)$. We view the damage as being affected by exogenous factors such as bad luck, weather or road conditions. Its independence of $(\theta, a)$ excludes moral hazard as (say) risk averse agents’ actions might reduce the damage per accident. This issue is left for future research. Section 5.2 discusses how A1-(iv,v) can be tested in view of the restriction it implies on observables.
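As an illustration of the Poisson mixture in A1-(ii,iii), the following minimal sketch computes the population distribution of the number of accidents when risk takes two values. The two-point distribution of $\theta$ and the names `THETAS`, `WEIGHTS` and `J_MAX` are hypothetical choices made only for this example; the paper leaves the distribution of $\theta$ unspecified. The sketch checks the standard mixture properties that the mean number of accidents equals the mean risk and that its variance exceeds its mean whenever risk is heterogeneous.

```python
import math


def poisson_pmf(j, theta):
    """A1-(iii): Pr[J = j | theta] = exp(-theta) * theta**j / j!."""
    return math.exp(-theta) * theta ** j / math.factorial(j)


def mixture_pmf(j, thetas, weights):
    """Population probability of j accidents when theta is discrete."""
    return sum(w * poisson_pmf(j, th) for th, w in zip(thetas, weights))


# Hypothetical two-point distribution of risk (illustration only).
THETAS = [0.1, 0.5]
WEIGHTS = [0.7, 0.3]
J_MAX = 60  # truncation point; the neglected tail mass is negligible here

pmf = [mixture_pmf(j, THETAS, WEIGHTS) for j in range(J_MAX)]
mean_j = sum(j * p for j, p in enumerate(pmf))
var_j = sum(j * j * p for j, p in enumerate(pmf)) - mean_j ** 2

mean_theta = sum(w * th for th, w in zip(THETAS, WEIGHTS))
var_theta = sum(w * th ** 2 for th, w in zip(THETAS, WEIGHTS)) - mean_theta ** 2

# Mixture of Poissons: E[J] = E[theta] and Var[J] = E[theta] + Var[theta].
print(f"E[J] = {mean_j:.4f}, E[theta] = {mean_theta:.4f}")
print(f"Var[J] = {var_j:.4f}, E[theta] + Var[theta] = {mean_theta + var_theta:.4f}")
```

The excess of the variance over the mean is exactly the variance of risk in this sketch, which gives one intuition for why the observed number of claims is informative about the distribution of $\theta$.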

Lastly, following Stiglitz (1977) and empirical papers such as Cohen and Einav (2007) among others, we assume the insurer acts as a monopolist. The concentration ratios and the profits made in the insurance industry indicate that it is not a competitive market. See Chiappori, Jullien, Salanié and Salanié (2006) for automobile insurance and Dafny (2010) and Starc (2014) for health insurance. For instance, switching costs for automobile and home insurance and/or the limited number of employer-offered coverages in health insurance may prevent insurees from benefiting from competition. See Israel (2005a,b) and Honka (2014) for evidence in the automobile industry. Considering an oligopoly would add great complexity to the model because of the increasing dimension of adverse selection due to product differentiation. In view of this, we consider the monopoly as a reasonable trade-off.

5 Cohen and Einav (2007) consider a lognormal mixture of Poisson distributions.

The model primitives $[F(\cdot, \cdot), H(\cdot)]$ are common knowledge. The timing is as follows. Each insuree draws independently a pair of types $(\theta, a)$ from $F(\cdot, \cdot)$. The insurance company proposes a menu of insurance contracts of the form $[t, dd]$, where $dd$ is the deductible per accident. The insuree chooses the contract that maximizes his utility and pays the corresponding premium. In case of an accident with damage below the deductible, the insuree pays for it. Otherwise, the insurer pays the damage above the deductible and the insuree pays the deductible.


Insurer’s Optimization Problem 

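The insurer offers a continuum of contracts $[t(\theta, a), dd(\theta, a)]$ for $(\theta, a) \in \Theta \times \mathcal{A}$. Integrating over $(\theta, a)$, the insurer’s expected profit is given by

$$
\begin{aligned}
E[\pi(\theta, a)] &= \int_{\Theta \times \mathcal{A}} \left[ t(\theta, a) - \theta \int_0^{\overline{d}} \max\{0, D - dd(\theta, a)\}\, dH(D) \right] dF(\theta, a) \\
&= \int_{\Theta \times \mathcal{A}} \left[ t(\theta, a) - \theta \int_{dd(\theta, a)}^{\overline{d}} (1 - H(D))\, dD \right] dF(\theta, a), \qquad (1)
\end{aligned}
$$

where $\max\{0, D - dd(\theta, a)\}$ reflects that the insurer only covers the damage above the deductible. The first equality uses that damages are i.i.d. conditional on $(\theta, a)$ by A1-(iv,v) while the second equality follows from integration by parts. The inside integral is the expected payment per accident while $\theta$ is the expected number of accidents.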
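The integration-by-parts step in (1) can be checked numerically. The sketch below is only an illustration under an assumed damage distribution, uniform on $[0, \overline{d}]$ with the hypothetical value $\overline{d} = 10$; the paper leaves $H(\cdot)$ unspecified. The helper names (`midpoint_integral`, `payment_direct`, `payment_survival`, `profit`) are introduced here for the example.

```python
D_BAR = 10.0  # hypothetical upper bound of the damage support


def H(x):
    """Assumed damage cdf: uniform on [0, D_BAR] (illustration only)."""
    return min(max(x / D_BAR, 0.0), 1.0)


def h(x):
    """Density associated with the assumed uniform cdf."""
    return 1.0 / D_BAR if 0.0 <= x <= D_BAR else 0.0


def midpoint_integral(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


def payment_direct(dd):
    """Expected payment per accident, first line of (1): E[max{0, D - dd}]."""
    return midpoint_integral(lambda x: max(0.0, x - dd) * h(x), 0.0, D_BAR)


def payment_survival(dd):
    """Same quantity after integration by parts, second line of (1)."""
    return midpoint_integral(lambda x: 1.0 - H(x), dd, D_BAR)


def profit(t, dd, theta):
    """Expected profit from one insuree with risk theta buying (t, dd)."""
    return t - theta * payment_survival(dd)


print(payment_direct(2.0), payment_survival(2.0))
print(profit(1.0, 2.0, 0.25))
```

Under the uniform assumption both integrals equal $(\overline{d} - dd)^2 / (2\overline{d})$, so the two printed payments should agree up to numerical error.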
For a $(\theta, a)$-individual with wealth $w$, his expected utility without insurance is

$$
\begin{aligned}
V(0, 0; \theta, a) &= p_0(\theta)U(w; a) + p_1(\theta)E[U(w - D_1; a)] + p_2(\theta)E[U(w - D_1 - D_2; a)] + \ldots \\
&= -p_0(\theta)e^{-aw} - p_1(\theta)e^{-aw}E[e^{aD_1}] - p_2(\theta)e^{-aw}E[e^{aD_1}]E[e^{aD_2}] - \ldots \\
&= -e^{-aw}\left[p_0(\theta) + p_1(\theta)\phi_a + p_2(\theta)\phi_a^2 + \ldots\right] \\
&= -e^{-aw - \theta}\left[1 + \frac{\theta\phi_a}{1!} + \frac{\theta^2\phi_a^2}{2!} + \ldots\right] = -e^{-aw + \theta(\phi_a - 1)}, \qquad (2)
\end{aligned}
$$

where $\phi_a = E[e^{aD}] > 1$, and the expectation is with respect to $D$. The first equality considers all the possibilities regarding the number of accidents and their costs to an individual without insurance. The second equality uses the CARA utility function and the independence of damages across accidents by A1-(i,iv,v). The third equality uses damages being identically distributed by A1-(iv). Lastly, the fourth equality relies on the Poisson distribution of accidents by A1-(iii). Using the same derivation where $w$ and $D_j$ are replaced by $w - t$ and $\min\{dd, D_j\}$, respectively, the expected utility of a $(\theta, a)$-individual buying insurance $(t, dd)$ is

$$V(t, dd; \theta, a) = -e^{-a(w - t) + \theta(\phi_a^* - 1)}, \qquad (3)$$

where $\phi_a^* = E[e^{a\min\{dd, D\}}] = \int_0^{\overline{d}} e^{a\min\{dd, D\}}\, dH(D) = \int_0^{dd} e^{aD}\, dH(D) + e^{a\,dd}(1 - H(dd)) > 1$. We remark that $\phi_a^* \leq \phi_a$ as $\min\{dd, D\} \leq D$.
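The closed forms (2) and (3) can be compared with a truncated version of the Poisson series from which they are derived. The sketch below again assumes, purely for illustration, damages uniform on $[0, \overline{d}]$ with $\overline{d} = 10$, for which $\phi_a$ and $\phi_a^*$ have simple expressions; the parameter values and function names are hypothetical.

```python
import math

D_BAR = 10.0  # hypothetical upper bound of the damage support


def phi(a):
    """phi_a = E[exp(a*D)] for D ~ Uniform[0, D_BAR] (assumed)."""
    return (math.exp(a * D_BAR) - 1.0) / (a * D_BAR)


def phi_star(a, dd):
    """phi_a^* = E[exp(a*min{dd, D})] under the same assumption."""
    below = (math.exp(a * dd) - 1.0) / (a * D_BAR)   # damages below dd
    above = math.exp(a * dd) * (1.0 - dd / D_BAR)    # damages capped at dd
    return below + above


def poisson_pmf(j, theta):
    """A1-(iii): probability of j accidents given risk theta."""
    return math.exp(-theta) * theta ** j / math.factorial(j)


def utility_series(w, theta, a, mgf, n_terms=80):
    """Third line of (2): -exp(-a*w) * sum_j p_j(theta) * mgf**j, truncated."""
    total = sum(poisson_pmf(j, theta) * mgf ** j for j in range(n_terms))
    return -math.exp(-a * w) * total


def utility_closed_form(w, theta, a, mgf):
    """Closed form in (2) and (3): -exp(-a*w + theta*(mgf - 1))."""
    return -math.exp(-a * w + theta * (mgf - 1.0))


W, THETA, A, T, DD = 50.0, 0.3, 0.1, 1.0, 2.0  # illustrative values

v0_series = utility_series(W, THETA, A, phi(A))
v0_closed = utility_closed_form(W, THETA, A, phi(A))
v1_series = utility_series(W - T, THETA, A, phi_star(A, DD))
v1_closed = utility_closed_form(W - T, THETA, A, phi_star(A, DD))

print(v0_series, v0_closed)  # no insurance, equation (2)
print(v1_series, v1_closed)  # coverage (T, DD), equation (3)
```

With these illustrative values the insured utility exceeds the uninsured one, so this particular type would accept the contract.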

Given a menu of contracts, the $(\theta, a)$-individual chooses the contract that maximizes his expected utility as defined above. Following the revelation principle, we can focus on a direct mechanism that maps types to contract terms, i.e., $[t(\theta, a), dd(\theta, a)]$. The insurer, however, should choose implementable contracts that satisfy the insuree’s optimization or (IC) constraint as well as the insuree’s participation or (IR) constraint. This gives the following optimization problem

$$\max_{t(\cdot, \cdot),\, dd(\cdot, \cdot)} E[\pi(\theta, a)] \qquad (4)$$

s.t.

$$V[t(\theta, a), dd(\theta, a); \theta, a] \geq V[t(\tilde{\theta}, \tilde{a}), dd(\tilde{\theta}, \tilde{a}); \theta, a] \quad \forall (\tilde{\theta}, \tilde{a}) \in \Theta \times \mathcal{A} \qquad \text{(IC)}$$

$$V[t(\theta, a), dd(\theta, a); \theta, a] \geq V(0, 0; \theta, a) \quad \forall (\theta, a) \in \Theta \times \mathcal{A} \qquad \text{(IR)},$$

where the expected profit is given in (1). The (IC) constraint ensures that the $(\theta, a)$-individual chooses the contract $(t(\theta, a), dd(\theta, a))$. The (IR) constraint guarantees that buying this contract is better for this individual than having no insurance.

As is well known, multidimensional types lead to a complex screening problem. See Rochet and Stole (2003). As noted previously, an insuree with high risk but low risk aversion might have the same willingness to pay for a given coverage $(t, dd)$ as an insuree with low risk but high risk aversion. This substitutability between risk and risk aversion implies that a separating equilibrium, where each individual $(\theta, a)$ gets a unique coverage, is infeasible. Thus, pooling occurs across insurees. Intuitively, insurees have two sources of private information while the insurer has in fact a single instrument, the deductible, to screen insurees. Indeed, the premium and deductible are inversely related as a contract $(t, dd)$ will always be preferred to any other contract $(t, dd')$ with $dd' > dd$. Thus, the insurer’s objective is to find the best way to pool insurees such that offered coverages are feasible, i.e., satisfy the (IC) and (IR) constraints, while maximizing its expected profit.⁶
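The substitutability between risk and risk aversion can be made concrete. From (2) and (3), $V(t, dd; \theta, a) \geq V(0, 0; \theta, a)$ is equivalent to $t \leq \theta(\phi_a - \phi_a^*)/a$, so the right-hand side is the largest premium a $(\theta, a)$-individual accepts for a deductible $dd$. The sketch below constructs two different types with the same willingness to pay. It assumes uniform damages on $[0, 10]$ and uses hypothetical type values; none of these numbers come from the paper.

```python
import math

D_BAR = 10.0  # hypothetical upper bound of the damage support


def phi(a):
    """phi_a = E[exp(a*D)] under the assumed Uniform[0, D_BAR] damages."""
    return (math.exp(a * D_BAR) - 1.0) / (a * D_BAR)


def phi_star(a, dd):
    """phi_a^* = E[exp(a*min{dd, D})] under the same assumption."""
    return (math.exp(a * dd) - 1.0) / (a * D_BAR) + math.exp(a * dd) * (1.0 - dd / D_BAR)


def willingness_to_pay(theta, a, dd):
    """Largest premium t with V(t, dd; theta, a) >= V(0, 0; theta, a)."""
    return theta * (phi(a) - phi_star(a, dd)) / a


def matching_risk(target_wtp, a, dd):
    """Risk at which an insuree with risk aversion a has the target willingness to pay."""
    return target_wtp * a / (phi(a) - phi_star(a, dd))


DD = 2.0
theta_1, a_1 = 0.40, 0.05  # higher risk, lower risk aversion
a_2 = 0.15                 # higher risk aversion

wtp_1 = willingness_to_pay(theta_1, a_1, DD)
theta_2 = matching_risk(wtp_1, a_2, DD)  # lower risk that yields the same value
wtp_2 = willingness_to_pay(theta_2, a_2, DD)

print(f"type 1: theta = {theta_1:.3f}, a = {a_1:.2f}, WTP = {wtp_1:.4f}")
print(f"type 2: theta = {theta_2:.3f}, a = {a_2:.2f}, WTP = {wtp_2:.4f}")
```

The two types cannot be told apart by their willingness to pay for this deductible, which illustrates why a single instrument cannot fully separate bidimensional types.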

Certainty Equivalence 

Following Landsberger and Meilijson (1999), we use the certainty equivalence of no insurance as a one-dimensional aggregation of the two dimensions of private information. A similar aggregation approach was proposed by Laffont, Maskin and Rochet (1987).⁷ Screening based on certainty equivalence has two main advantages. First, it does not rely as much on parametric specifications of the model primitives. Second, certainty equivalence has a natural economic interpretation. See also Armstrong (1996) who uses the production cost for a multiproduct firm to screen consumers with multidimensional types. We make the following assumption.

6 A simple argument shows that screening on risk or risk aversion only is not optimal for the insurer. Consider three individuals $(\theta_1, a_1)$, $(\theta_2, a_1)$ and $(\theta_2, a_2)$ with $\theta_1 < \theta_2$ and $a_1 < a_2$; then having (say) the first and second buying the same coverage is not optimal for the insurer’s profit.

7 See also Ivaldi and Martimort (1994) for an application to competitive nonlinear pricing. For alternative approaches, see (say) Wilson (1993), who adopts a partitioning of the type set into one-dimensional subsets, Rochet and Choné (1998), who propose a general approach for multidimensional screening when the number of types is equal to the number of instruments, and Basov (2001) for the general case.
Assumption A2: For any given coverage $(t, dd)$, the difference $V(t, dd; \theta, a) - V(0, 0; \theta, a)$ is increasing in $a$.

We remark that the above difference is automatically increasing in $\theta$. Thus, individuals with higher risk or risk aversion value insurance more than those with lower risk or risk aversion. Assumption A2 ensures that there will be no countervailing incentives because the (IR) constraint in (4), namely $V(t, dd; \theta, a) - V(0, 0; \theta, a) \geq 0$, has a LHS increasing in both $(\theta, a)$. We note that A2 restricts the coverage $(t, dd)$ for a $(\theta, a)$-individual relative to the damage distribution. This assumption can be verified ex-post upon identification of the model primitives.

The certainty equivalence $CE(0, 0; \theta, a)$ of no insurance coverage is defined by the amount of certain wealth for the insuree that will give him the same level of utility when he has no coverage, i.e., $-\exp(-a\,CE(0, 0; \theta, a)) = V(0, 0; \theta, a)$. Thus, by (2)

$$CE(0, 0; \theta, a) = w - \frac{\theta(\phi_a - 1)}{a}. \qquad (5)$$

The certainty equivalence $CE(t, dd; \theta, a)$ of having the coverage $(t, dd)$ is defined similarly as the amount of certain wealth for the insuree that will give him the same level of utility when buying coverage, i.e., $-\exp(-a\,CE(t, dd; \theta, a)) = V(t, dd; \theta, a)$. Thus, by (3)

$$CE(t, dd; \theta, a) = w - t - \frac{\theta(\phi_a^* - 1)}{a}. \qquad (6)$$

The next lemma establishes the monotonicity in $(\theta, a)$ of these certainty equivalences. All
proofs are in the Appendix.

Lemma 1: The certainty equivalences (5) and (6) are both decreasing in risk and risk 
aversion. 

The certainty equivalence of no insurance in (5) defines a locus of pairs $(\theta, a)$ as a downward sloping curve $\theta(a)$ for any given value $s$ of certainty equivalence. Because $s = CE(0, 0; \theta, a)$ is a function of $(\theta, a)$, namely $s(\theta, a)$, it is random and distributed as $K(\cdot)$ with some density $k(\cdot)$ on $[\underline{s}, \overline{s}]$, where $\underline{s} = s(\overline{\theta}, \overline{a})$ and $\overline{s} = s(\underline{\theta}, \underline{a})$, respectively. Figure 1 displays some $s$-isocurves.
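The monotonicity in Lemma 1 and the shape of the $s$-isocurves can be illustrated numerically. The sketch below evaluates (5) and (6) on a small grid and solves (5) for $\theta$ to trace one isocurve. It is an illustration under assumed primitives only: damages uniform on $[0, 10]$, wealth $w = 50$, and grids of types chosen for the example.

```python
import math

D_BAR = 10.0  # hypothetical upper bound of the damage support
W = 50.0      # hypothetical wealth


def phi(a):
    """phi_a = E[exp(a*D)] under the assumed Uniform[0, D_BAR] damages."""
    return (math.exp(a * D_BAR) - 1.0) / (a * D_BAR)


def phi_star(a, dd):
    """phi_a^* = E[exp(a*min{dd, D})] under the same assumption."""
    return (math.exp(a * dd) - 1.0) / (a * D_BAR) + math.exp(a * dd) * (1.0 - dd / D_BAR)


def ce_no_insurance(theta, a):
    """Equation (5)."""
    return W - theta * (phi(a) - 1.0) / a


def ce_insured(t, dd, theta, a):
    """Equation (6)."""
    return W - t - theta * (phi_star(a, dd) - 1.0) / a


def iso_theta(s, a):
    """Risk theta(a) on the s-isocurve, obtained by solving (5) for theta."""
    return a * (W - s) / (phi(a) - 1.0)


def strictly_decreasing(values):
    return all(x > y for x, y in zip(values, values[1:]))


THETAS = [0.1, 0.2, 0.3, 0.4, 0.5]
AVERSIONS = [0.02, 0.04, 0.06, 0.08, 0.10]
T, DD, S = 1.0, 2.0, 48.0

# Lemma 1 on the grid: both certainty equivalences fall with theta and with a.
lemma1_theta = all(
    strictly_decreasing([ce_no_insurance(th, a) for th in THETAS])
    and strictly_decreasing([ce_insured(T, DD, th, a) for th in THETAS])
    for a in AVERSIONS
)
lemma1_aversion = all(
    strictly_decreasing([ce_no_insurance(th, a) for a in AVERSIONS])
    and strictly_decreasing([ce_insured(T, DD, th, a) for a in AVERSIONS])
    for th in THETAS
)

# One s-isocurve: pairs (theta(a), a) sharing the same certainty equivalence S.
isocurve = [iso_theta(S, a) for a in AVERSIONS]

print(lemma1_theta, lemma1_aversion)
print([round(th, 4) for th in isocurve])
```

A grid check of this kind is not a proof, but it shows the trade-off along an isocurve: a more risk averse insuree must have a lower risk to reach the same certainty equivalence.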

Solving the Multidimensional Screening Problem 
Figure 1: Certainty Equivalence


The optimization problem (4) is known to be difficult to solve because of multidimensional private information and the loss of the single-crossing property. The literature on multidimensional screening shows that pooling at equilibrium cannot be avoided. To make the parallel with the literature on multidimensional screening with a focus on nonlinear pricing, we remark that the premium $t$ plays the role of the payment and $-dd$ plays the role of the quantity as seen in (3) and (6). Thus, $dd$ is the only instrument for two dimensional types. The certainty equivalence without insurance aggregates these two dimensions into a single one. We then rewrite the optimization problem (4) in terms of $s = CE(0, 0; \theta, a)$ and the (IC) and (IR) constraints using the certainty equivalences $CE(t, dd; \theta, a)$ and $CE(0, 0; \theta, a)$. Cohen and Einav (2007) also use the certainty equivalence to explain the choice of coverage by insurees.

Here, we consider a continuum of contracts $[t(s), dd(s)]$ for $s \in [\underline{s}, \overline{s}]$. Thus all insurees with the same value of certainty equivalence $s$ are pooled. Intuitively, the insuree with the highest outside option or the highest certainty equivalence will be treated as the individual with the lowest willingness-to-pay or the lowest risk individual in Stiglitz (1977). The insurer needs to propose an attractive coverage with a high deductible to induce truth-telling and participation because this insuree values insurance the least. On the other hand, the individual with the lowest outside option or the lowest certainty equivalence is offered full coverage or $dd = 0$ as shown later. Landsberger and Meilijson (1999) show that optimal insurance contracts preserve the order of certainty equivalence, i.e., for any pair of types $(\theta, a)$ and $(\theta', a')$ such that $s(\theta, a) < s(\theta', a')$ or $s < s'$, the optimal contract $[t(s), dd(s)]$ satisfies $s(\theta, a) \leq CE(t(s), dd(s); \theta, a) \leq s(\theta', a') \leq CE(t(s'), dd(s'); \theta', a')$. The first and third inequalities come from the (IR) constraints, i.e., an individual will buy insurance if his utility is larger than not buying insurance. This property ensures that screening on $s$ is implementable.

We rewrite the expected profit (1) in terms of $s$. Noting $t(\theta, a) = t(s)$ and $dd(\theta, a) = dd(s)$ and making the change of variables $(\theta, a)$ to $(\theta, s)$ in (1) give

$$E[\pi(\theta, a)] = \int_{\underline{s}}^{\overline{s}} \left[ t(s) - E(\theta \mid s) \int_{dd(s)}^{\overline{d}} (1 - H(D))\, dD \right] k(s)\, ds, \qquad (7)$$

where $k(\cdot)$ is the density of certainty equivalence $s$. The (IC) and (IR) constraints in (4) become

$$CE(t(s), dd(s); \theta, a) \geq CE(t(\tilde{s}), dd(\tilde{s}); \theta, a) \quad \forall \tilde{s} \in [\underline{s}, \overline{s}] \qquad \text{(IC)}$$

$$CE(t(s), dd(s); \theta, a) \geq s, \qquad \text{(IR)}$$

for all $(\theta, a)$ satisfying $CE(0, 0; \theta, a) = s$ and all $s \in [\underline{s}, \overline{s}]$. The schedule $[t(s), dd(s)]$ can be converted into the nonlinear premium $t^{+}(dd) = t[s^{-1}(dd)]$ which is decreasing and convex in deductible, i.e., the marginal price for higher coverage is increasing. This is similar to the concavity of tariff in nonlinear pricing models. See (say) Tirole (1988).

We note that for each $s$ there must exist at least one $(\theta, a)$-individual, or equivalently
$(\theta(s), a(s))$-individual, for whom the (IC) constraint binds. Thus, for this individual the
(IC) constraint can be written as

$$\max_{\tilde{s}} CE(t(\tilde{s}),dd(\tilde{s});\theta(s),a(s)) = \max_{\tilde{s}} \; w - t(\tilde{s}) - \frac{\theta(s)\left[\int_0^{dd(\tilde{s})} e^{a(s)D}\,dH(D) + e^{a(s)dd(\tilde{s})}[1 - H(dd(\tilde{s}))] - 1\right]}{a(s)},$$

leading to the local (IC) constraint given by the first-order condition at $\tilde{s} = s$

$$t'(s) = -\theta(s)\,e^{a(s)dd(s)}[1 - H(dd(s))]\,dd'(s),$$

where $\theta(s) = a(s)(w - s)/[\phi_a - 1]$ with $a = a(s)$, and the prime indicates a derivative. This gives

$$dd'(s) = -\eta(s,a(s),dd(s))\,t'(s), \tag{8}$$

for all $s \in [\underline{s},\overline{s}]$, where

$$\eta(s,a(s),dd(s)) = \frac{\phi_a - 1}{a(s)(w - s)\,e^{a(s)dd(s)}[1 - H(dd(s))]} > 0. \tag{9}$$
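As a check on the step from the maximization problem to the local (IC) constraint, the derivative with respect to the announced certainty equivalence can be written out explicitly. The sketch below uses only the certainty-equivalent expression displayed above; $\tilde{s}$ denotes the announced value and $B(dd)$ is shorthand introduced here for the bracketed term, not notation taken from the paper.

```latex
% Shorthand (introduced here for illustration): B(dd) is the bracket in the CE expression.
B(dd) = \int_0^{dd} e^{a(s)D}\,dH(D) + e^{a(s)dd}\,[1 - H(dd)] - 1,
\qquad
\frac{\partial B(dd)}{\partial dd}
  = e^{a(s)dd}h(dd) + a(s)e^{a(s)dd}[1 - H(dd)] - e^{a(s)dd}h(dd)
  = a(s)\,e^{a(s)dd}[1 - H(dd)].

% First-order condition in \tilde{s}, evaluated at \tilde{s} = s:
0 = -t'(s) - \frac{\theta(s)}{a(s)}\,\frac{\partial B(dd(s))}{\partial dd}\,dd'(s)
  = -t'(s) - \theta(s)\,e^{a(s)dd(s)}[1 - H(dd(s))]\,dd'(s).
```

The two density terms cancel, so only the survival term $1 - H(dd)$ remains. Substituting $\theta(s) = a(s)(w - s)/[\phi_a - 1]$ then yields (8) with $\eta$ as defined in (9).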

Equation (8) is the local incentive compatibility constraint for the insurer’s optimization
problem. Regarding the individual rationality constraint, in view of the previous discussion,
the $\overline{s}$-individual has the largest outside option of no insurance. Thus, the insurer
should bind the (IR) constraint for this individual and make him indifferent between
buying insurance or not. This gives the (IR) constraint

$$CE(t(\overline{s}),dd(\overline{s});\underline{\theta},\underline{a}) = \overline{s}. \tag{10}$$


We can now solve the insurer’s problem which is to maximize his expected profit (7)
subject to (8) and (10). Applying the Pontryagin principle (see Appendix), the optimal
coverage $(t(s), dd(s))$ solves

$$\eta(s,a(s),dd(s))\,E(\theta|s)[1 - H(dd(s))] + \frac{K(s)}{k(s)}\,\frac{1}{\eta(s,a(s),dd(s))}\left[-\frac{\partial \eta(s,a(s),dd(s))}{\partial dd}\,dd'(s) + \eta'(s,a(s),dd(s))\right] = 1, \tag{11}$$

$$dd'(s) = -\eta(s,a(s),dd(s))\,t'(s), \tag{12}$$

where $\eta'(s,a(s),dd(s))$ denotes the total derivative of $\eta(s,a(s),dd(s))$ with respect to $s$,
with the boundary condition $CE(t(\overline{s}),dd(\overline{s});\underline{\theta},\underline{a}) = \overline{s}$. Evaluating (11) at $\underline{s}$, i.e. for
the $(\overline{\theta},\overline{a})$ individual, shows that $dd(\underline{s}) = 0$, i.e. the highest risk/risk averse individual
is offered full coverage as in the benchmark model of Stiglitz (1977).$^{8}$ The next lemma
implies that the deductible at equilibrium is increasing in $s$.

Lemma 2: An insurance contract $[t(s),dd(s)]$ satisfies the (IC) constraint if and only if
$dd(s)$ is increasing in $s$.

Since the equilibrium contract satisfies the (IC) constraint, its deductible is increasing in
$s$. In other words, individuals with lower risk and/or risk aversion have lower coverage
with a larger deductible.

8 Because $K(\underline{s}) = 0$, (11) and (9) give $[\phi_{\overline{a}} - 1]E(\theta|\underline{s})/[\overline{a}(w - \underline{s})e^{\overline{a}\,dd(\underline{s})}] = 1$. Using (5) and
$E(\theta|\underline{s}) = \overline{\theta}$ give $e^{\overline{a}\,dd(\underline{s})} = 1$.

Finite Number of Contracts 

The principal may offer a finite number $C$ of contracts from which the agent can
choose. To simplify the presentation, we consider $C = 2$, where $C$ is exogenous. Let
$(t_1,dd_1)$ and $(t_2,dd_2)$ with $t_1 < t_2$ and $dd_1 > dd_2$ be these two contracts. We show how
the insurer can determine these two contracts optimally. In addition to the pooling of
pairs $(\theta, a)$ leading to the same certainty equivalence $s$, there is bunching of agents with
different values of $s$.

The insurer chooses $(t_1,dd_1,t_2,dd_2)$ to maximize his expected profit. Let $\mathcal{S}_c$ be the set
of agents choosing the contract $(t_c,dd_c)$, $c = 1,2$. Similarly to (7), we have

$$E[\pi] = \sum_{c=1}^{2} \int_{\mathcal{S}_c} \left[ t_c - \theta \int_{dd_c}^{\overline{d}} (1 - H(D))\,dD \right] dF(\theta,a) = \sum_{c=1}^{2} \nu_c \left[ t_c - E[\theta|\mathcal{S}_c] \int_{dd_c}^{\overline{d}} (1 - H(D))\,dD \right],$$

where the second equality follows from $\int_{\mathcal{S}_c} \theta\,dF(\theta,a) = \nu_c E[\theta|\mathcal{S}_c]$ with $\nu_c = \int_{\mathcal{S}_c} dF(\theta,a)$
being the proportion of insurees choosing contract $c$. The optimal contracts also need to
satisfy the incentive compatibility and participation constraints:

$$CE(t_c,dd_c;\theta,a) \ge CE(t_{c'},dd_{c'};\theta,a), \quad c \ne c', \; \forall (\theta,a) \in \mathcal{S}_c, \; c = 1,2,$$

$$CE(t_c,dd_c;\theta,a) \ge CE(0,0;\theta,a), \quad \forall (\theta,a) \in \mathcal{S}_c, \; c = 1,2.$$

The (IC) constraint reduces to two subsets $\mathcal{S}_1$ and $\mathcal{S}_2$ that partition $\Theta \times \mathcal{A}$ such that
individuals in $\mathcal{S}_1$ and $\mathcal{S}_2$ choose $(t_1,dd_1)$ and $(t_2,dd_2)$, respectively. The frontier between
$\mathcal{S}_1$ and $\mathcal{S}_2$ is determined by the locus of $(\theta, a)$-insurees who are indifferent between the
two contracts, i.e., for whom $CE(t_1,dd_1;\theta,a) = CE(t_2,dd_2;\theta,a)$. Using (6), the frontier
is the strictly decreasing curve in $\Theta \times \mathcal{A}$ defined by

$$\theta(a) = \frac{a\,(t_2 - t_1)}{\int_0^{dd_1} e^{aD}\,dH(D) + e^{a\,dd_1}(1 - H(dd_1)) - \int_0^{dd_2} e^{aD}\,dH(D) - e^{a\,dd_2}(1 - H(dd_2))} = \frac{t_2 - t_1}{\int_{dd_2}^{dd_1} e^{aD}(1 - H(D))\,dD}, \tag{13}$$

where the second equality uses integration by parts. Regarding the (IR) constraints, the
only one that binds is for the $(\underline{\theta},\underline{a})$-insuree, i.e. $CE(t_1,dd_1;\underline{\theta},\underline{a}) = \overline{s}$.
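The frontier (13) is easy to evaluate numerically once a damage distribution is fixed. The following is a minimal sketch, assuming damages uniform on $[0, \overline{d}]$ purely for illustration; the function names, the midpoint integration rule and the parameter values in the usage note are hypothetical choices, not part of the paper. It computes $\theta(a)$ from the second expression in (13) and the certainty equivalent from the CARA/Poisson expression used above, so that indifference along the frontier can be checked.

```python
import math

def survival_uniform(D, d_bar):
    """1 - H(D) for damages uniform on [0, d_bar] (an illustrative assumption)."""
    return min(1.0, max(0.0, 1.0 - D / d_bar))

def integrate(f, lo, hi, n=2000):
    """Composite midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

def frontier_theta(a, t1, dd1, t2, dd2, d_bar):
    """Risk theta(a) of the insuree indifferent between the two contracts, as in (13)."""
    denom = integrate(lambda D: math.exp(a * D) * survival_uniform(D, d_bar), dd2, dd1)
    return (t2 - t1) / denom

def certainty_equivalent(t, dd, theta, a, w, d_bar):
    """CE of coverage (t, dd) under CARA utility and Poisson accidents."""
    # E[exp(a * min(D, dd))] = int_0^dd e^{aD} dH(D) + e^{a dd} (1 - H(dd))
    inner = integrate(lambda D: math.exp(a * D) / d_bar, 0.0, dd)
    inner += math.exp(a * dd) * survival_uniform(dd, d_bar)
    return w - t - theta * (inner - 1.0) / a
```

For instance, with `w = 100`, `d_bar = 10`, `(t1, dd1) = (1, 4)` and `(t2, dd2) = (2, 1)`, the two certainty equivalents coincide at `theta = frontier_theta(a, ...)` up to integration error, and `frontier_theta` decreases as `a` increases, in line with the frontier being strictly decreasing.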

Maximizing $E[\pi]$ with respect to $(t_1,dd_1,t_2,dd_2)$ subject to the (IC) and (IR) constraints
gives the first-order conditions (14)-(18),

where p is the Lagrangian multiplier associated with the (IR) constraint and $a^*$ is the
minimum of $\overline{a}$ and the value which solves (13) evaluated at $\underline{\theta}$.


Extensions 

Our model extends to other insurance contracts such as health insurance. Up to some variations,
health insurance involves a premium $t$ as well as a per period deductible $dd$ and a copayment
$\gamma$ per medical procedure/visit. In particular, and in contrast to automobile
insurance, the deductible is not per visit while the copayment arises in the first procedure/visit
after the deductible is met. In this case, for a contract $[t(\theta,a), dd(\theta,a), \gamma(\theta,a)]$,
the insurer’s expected profit (1) becomes

$$E[\pi] = \int_{\Theta \times \mathcal{A}} \Big\{ t(\theta,a) - E\big[ \mathbb{1}(D_1 + \dots + D_J > dd(\theta,a)) \, (D_1 + \dots + D_J - dd(\theta,a) - \gamma(\theta,a)(J - J^\dagger)) \big] \Big\}\, dF(\theta,a),$$

where the expectation in the integral is with respect to the total expense $D_1 + \dots + D_J$, the
number $J$ of visits and $J^\dagger$, which is the minimal number of visits for which the deductible
is met, i.e., $J^\dagger = \arg\min_{j=1,\dots,J} \{ D_1 + \dots + D_j > dd \}$. The per visit expenses $D_j$, $j = 1,\dots,J$,
may no longer be independent. Indeed, a patient with a medical condition will exhibit
correlated medical expenses over the treatment period. Similarly, the per visit expense
$D_j$ might be correlated with the expected number of medical procedures/visits $\theta$.
Regarding the patient, his expected utility (2) without health insurance becomes

$$V(0,0,0;\theta,a) = E[U(w - D_1 - \dots - D_J; a)] = -e^{-aw} E[e^{a(D_1 + \dots + D_J)}],$$

under the CARA utility function by A1-(i), where the expectation is with respect to the
total expense $D_1 + \dots + D_J$ and the number $J$ of visits which depends on $\theta$. The expected
utility (3) of a $(\theta, a)$-patient buying coverage $(t, dd, \gamma)$ becomes

$$V(t,dd,\gamma;\theta,a) = -e^{-a(w - t)} E[e^{aX}],$$

where $X$ is the out-of-pocket expense $X = (D_1 + \dots + D_J)\,\mathbb{1}(D_1 + \dots + D_J \le dd) + (dd +
(J - J^\dagger)\gamma)\,\mathbb{1}(D_1 + \dots + D_J > dd)$. When there is a finite number $C$ of offered coverages,
the insurer partitions the set of types $\Theta \times \mathcal{A}$ using the patients’ certainty equivalences to
maximize his expected profit with respect to the contract terms $(t_c, dd_c, \gamma_c)$, $c = 1,\dots,C$.

3 Identification with a Continuum of Contracts 

In this section, we consider the case in which a continuum of coverages is offered to
each insuree. In particular, our identification analysis shows the key role played by the
number of accidents. The model structure is given by the joint distribution of risk and
risk aversion $F(\cdot,\cdot)$ and the damage distribution $H(\cdot)$. Besides the specification of the
CARA utility function and the Poisson distribution for the number of accidents, the
identification problem is nonparametric.$^{9}$ The problem of identification is to recover
uniquely the structure $[F(\cdot,\cdot), H(\cdot)]$ from the observables. In the case of a continuum of
contracts, we observe the contract purchased by each insuree $(t, dd)$ and the $J$ claims made
by each insuree with the corresponding amounts of damages $(D_1,\dots,D_J)$. In Section 3.2,
we observe $J^*$ claims with their corresponding damages $(D_1,\dots,D_{J^*})$ because of the
truncation at the deductible.

We introduce some observed variables $X$ characterizing the insuree and his/her car
that are used by the insurer to discriminate insurees.$^{10}$ Variables related to the insuree
may contain age, gender, education, marital status, location and driving experience. Variables
related to the insuree’s car may include car mileage, business use, car value, power,
model and make.$^{11}$ With the introduction of $X$ with values in the support $\mathcal{S}_X \subset \mathbb{R}^{\dim X}$,
the model structure becomes $[F(\theta,a|X), H(D|X)]$ as we expect that such variables affect
the insuree’s risk and risk aversion as well as the damage. For instance, the damage with
an expensive car is likely to be larger than the damage with an inexpensive one. Let
$G(\cdot|X)$ denote the observed deductible distribution conditional on $X$. It is crucial that
all the variables used by the insurer to discriminate insurees are included in $X$.

In identification studies of structural models, it is important to define the set of admissible
structures that are consistent with the assumptions of the theoretical model. We
formalize such assumptions on the structure and $(\theta, a, J, D, X)$. Specifically, the structure
$[F(\cdot,\cdot|X), H(\cdot|X)]$ belongs to $\mathcal{F}_X \times \mathcal{H}_X$ as defined below.


9 The problem of identifying nonparametrically the agent’s utility function is quite complex. In the 
context of auctions, the bidder’s utility function is not identified in general. Nonparametric identification 
is achieved with the help of exclusion restrictions using exogenous variations in the number of bidders 
as in Guerre, Perrigne and Vuong (2009) or with the help of additional data from ascending auctions 
as in Lu and Perrigne (2008). See also Campo, Guerre, Perrigne and Vuong (2009) for semiparametric 
identification when the bidder’s utility function is parameterized as CARA or CRRA. 

10 Variables that are not used to discriminate insurees can enter in the model through $(\theta, a)$ which can
be then viewed as aggregating observed and unobserved heterogeneity.

11 We can use the car value as a proxy for wealth $w$ so that $w$ is a variable in $X$.

Definition 1: Let $\mathcal{F}_X$ be the set of conditional distributions $F(\cdot,\cdot|X)$ satisfying

(i) For every $x \in \mathcal{S}_X$, $F(\cdot,\cdot|x)$ is a c.d.f. with compact support $\Theta(x) \times \mathcal{A}(x) = [\underline{\theta}(x),\overline{\theta}(x)] \times
[\underline{a}(x),\overline{a}(x)] \subset \mathbb{R}_{++} \times \mathbb{R}_{++}$,

(ii) The conditional density $f(\cdot,\cdot|\cdot) > 0$ on its support.

Definition 2: Let $\mathcal{H}_X$ be the set of distributions $H(\cdot|X)$ satisfying

(i) For every $x \in \mathcal{S}_X$, $H(\cdot|x)$ is a c.d.f. with compact support $[0,\overline{d}(x)] \subset \mathbb{R}_+$ with
$\sup_{x \in \mathcal{S}_X} \overline{d}(x) < +\infty$,

(ii) The conditional density $h(\cdot|\cdot) > 0$ on its support.

Assumption A3: We have

(i) $(D_1,\dots,D_J) \perp (\theta,a) \,|\, (J,X)$,

(ii) $(D_1,\dots,D_J) \,|\, (J,X)$ are i.i.d. as $H(\cdot|X)$,

(iii) $J \perp (X,a) \,|\, \theta$ with $J|\theta \sim \mathcal{P}(\theta)$, i.e. $\Pr[J = j|\theta] = e^{-\theta}\theta^j/j!$,

(iv) $(\theta,a,J,X)$ is i.i.d. with $(\theta,a)|X \sim F(\cdot,\cdot|X)$.

Assumption A3 parallels A1 with $X$. Assumption A3-(i) implies that, conditional on $X$,
the amount of damage does not provide any information on the insuree’s risk and risk aversion. For
instance, conditional on $X$, damages depend on exogenous factors that are independent of
$(\theta, a)$. In the same spirit, Assumption A3-(ii) says that damages are mutually independent
conditional on $X$. Regarding Assumption A3-(iii), the number of accidents $J$ depends on
the insuree’s risk $\theta$ only, while the Poisson distribution follows the theoretical model of
Section 2, where the insuree’s risk $\theta$ is the expected number of accidents. By Assumption
A3-(iv), $(\theta, a, J, X)$ is i.i.d. across insurees. We maintain Assumption A3 throughout the
paper. Lastly, this section assumes that the observed $(t, dd)$ correspond to the optimal
coverage schedule so that (11) and (12) are satisfied.

3.1 Case 1: Full Damage Distribution 

Case 1 considers a continuum of coverages offered to each insuree as well as the observation
of damage for every accident whether it is below or above the deductible. It follows
that $H(\cdot|X)$ is identified on $[0,\overline{d}(X)]$. It remains to study the identification of $F(\cdot,\cdot|X)$.
For the rest of Section 3, to simplify the notations, we suppress the conditioning on
$X$. We first proceed by studying the identification of the distribution $K(\cdot)$ of certainty
equivalence (5) of no coverage. The optimal contracts are characterized by (11) and (12).
Equation (11) defines a one-to-one mapping between the certainty equivalence $s$ and the
deductible $dd$, while (12) defines a one-to-one mapping between $dd$ and $t$. The key idea is
to exploit the former mapping to identify the distribution of certainty equivalence from
the observed deductible distribution $G(\cdot)$. This result is in the spirit of the nonparametric
identification literature on auctions and contracts.$^{12}$ We have $G(dd) = \Pr(\widetilde{dd} \le dd) =
\Pr(s(\widetilde{dd}) \le s(dd)) = \Pr(\tilde{s} \le s(dd)) = K(s)$ implying $g(dd) = k(s)s'(dd)$, with $s(\cdot)$ being
the inverse of $dd(\cdot)$ by monotonicity of the latter. Hence,

$$\frac{G(dd)}{g(dd)} = \frac{K(s)}{k(s)}\,\frac{1}{s'(dd)} = \frac{K(s)}{k(s)}\,dd'(s).$$

Substituting the above expression in (11) and using $dd'(s)s'(dd) = 1$, we obtain

$$\eta(s,a(s),dd(s))\,E[\theta|s](1 - H(dd)) + \frac{G(dd)}{g(dd)} \left\{ -\frac{\partial \eta(s,a(s),dd(s))/\partial dd}{\eta(s,a(s),dd(s))} + \frac{\eta'(s,a(s),dd(s))\,s'(dd)}{\eta(s,a(s),dd(s))} \right\} = 1.$$

From (12), we have $t_+'(dd) = -1/\eta(s,a(s),dd(s))$, where $t_+(dd) = t[s^{-1}(dd)]$ is the function
relating the deductible to the premium. We also have $dt_+'(dd(s))/ds = -d[\eta(s,a(s),dd(s))]^{-1}/ds$,
i.e. $t_+''(dd) \times dd'(s) = \eta'(s,a(s),dd(s))/[\eta(s,a(s),dd(s))]^2$ or equivalently $t_+''(dd) =
[\eta'(s,a(s),dd(s))/[\eta(s,a(s),dd(s))]^2] \times s'(dd)$. Using this result and dividing by $\eta(s,a(s),dd(s))$,
we can rewrite the previous equation as

$$E[\theta|s](1 - H(dd)) + \frac{G(dd)}{g(dd)} \left\{ -\frac{\partial \eta(s,a(s),dd(s))/\partial dd}{[\eta(s,a(s),dd(s))]^2} + t_+''(dd) \right\} = -t_+'(dd).$$


From the definition (9) of $\eta(\cdot,\cdot,\cdot)$, its partial derivative with respect to $dd$ is

$$\frac{\partial \eta(s,a(s),dd(s))}{\partial dd} = -\eta(s,a(s),dd(s)) \left[ a(s) - \frac{h(dd)}{1 - H(dd)} \right].$$

Thus, the first-order condition defining the optimal deductible can be rewritten as

$$E[\theta|dd](1 - H(dd)) + \frac{G(dd)}{g(dd)} \left\{ -t_+'(dd) \left( a(s) - \frac{h(dd)}{1 - H(dd)} \right) + t_+''(dd) \right\} = -t_+'(dd),$$

12 For auctions, see Guerre, Perrigne and Vuong (2000) and Athey and Haile (2007) where the mapping
between the observed bid and the unobserved private value identifies the private value distribution. For
contracts, see Luo, Perrigne and Vuong (2015) in the context of nonlinear pricing, and Perrigne and
Vuong (2011) in the context of a procurement model with adverse selection and moral hazard. The
mapping between the observed quantity/price and the unobserved consumer’s type/firm’s efficiency is
exploited to recover their underlying distribution, respectively.


where $E[\theta|s] = E[\theta|dd]$ because of the one-to-one mapping between $dd$ and $s$. After
elementary algebra, we obtain

$$a(s) = \frac{g(dd)}{G(dd)} + \frac{g(dd)\,E[\theta|dd](1 - H(dd))}{G(dd)\,t_+'(dd)} + \frac{t_+''(dd)}{t_+'(dd)} + \frac{h(dd)}{1 - H(dd)},$$

showing that $a(s)$ is identified as the right-hand side is observed or identified from observables.
In particular, $E[\theta|dd]$ is identified by the expected number of claims made by
insurees choosing the deductible $dd$ given that all the claims are observed, i.e. $E[\theta|dd] =
E[J|dd]$.$^{13}$ Then, using (8) and (9) we have

$$s = w + \frac{t_+'(dd)(\phi_a - 1)}{a(s)\exp(a(s)dd)(1 - H(dd))},$$

showing that the insuree’s certainty equivalence $s$ can be identified from his choice of
deductible $dd$ and the knowledge of $H(\cdot)$, $G(\cdot)$, $t_+(\cdot)$ and $E[J|dd]$. Thus, we have the
following result.


Lemma 3: Suppose that a continuum of optimal insurance coverages is offered to each
insuree and all accidents are observed. Under A3, the pair $[K(\cdot),H(\cdot)]$ is identified.
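The constructive nature of this result can be illustrated with a small numerical sketch. It assumes, for illustration only, damages uniform on $[0, \overline{d}]$ so that $\phi_a$ has a closed form; all function and argument names are hypothetical. The first recovery function solves the rewritten first-order condition for $a(s)$, and the second inverts $t_+'(dd) = -1/\eta$ for $s$.

```python
import math

def phi(a, d_bar):
    """phi_a = E[exp(a*D)] for damages uniform on [0, d_bar] (illustrative assumption)."""
    return (math.exp(a * d_bar) - 1.0) / (a * d_bar)

def eta(s, a, dd, w, d_bar):
    """eta(s, a, dd) as in (9), with 1 - H(dd) = 1 - dd/d_bar."""
    survival = 1.0 - dd / d_bar
    return (phi(a, d_bar) - 1.0) / (a * (w - s) * math.exp(a * dd) * survival)

def recover_risk_aversion(h_dd, H_dd, g_dd, G_dd, t1, t2, mean_claims):
    """Solve the rewritten first-order condition for a(s).

    t1 and t2 are the first and second derivatives of the premium schedule t_+
    at dd, and mean_claims stands for E[J | dd] = E[theta | dd].
    """
    survival = 1.0 - H_dd
    return (g_dd / G_dd) * (1.0 + mean_claims * survival / t1) + t2 / t1 + h_dd / survival

def recover_certainty_equivalence(t1, a, dd, w, d_bar):
    """Invert t_+'(dd) = -1/eta(s, a, dd) for the certainty equivalence s."""
    survival = 1.0 - dd / d_bar
    return w + t1 * (phi(a, d_bar) - 1.0) / (a * math.exp(a * dd) * survival)
```

In an application, the inputs `h_dd`, `H_dd`, `g_dd`, `G_dd`, `t1`, `t2` and `mean_claims` would be replaced by nonparametric estimates; the sketch only checks that the two formulas invert the model equations.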

It remains to investigate whether we can identify $F(\cdot,\cdot)$. A sketch of the argument is
as follows. From the moment generating function of the number of accidents $J$ conditional
on $s$, we identify the moment generating function of $\theta$ given $s$ in a neighborhood of zero.
As is well known, the latter identifies $F_{\theta|S}(\cdot|\cdot)$. Once we identify $F_{\theta|S}(\cdot|\cdot)$, we use $K(\cdot)$ to
derive the joint distribution of $(\theta,s)$. Identification of the joint density of $(\theta, a)$ follows
from the known one-to-one mapping between $(\theta, s)$ and $(\theta, a)$ given by (5). It is important
to note that the observed number of claims $J$ plays a crucial role in identifying $F_{\theta|S}(\cdot|\cdot)$.
This is possible because the Poisson distribution belongs to the class of distributions
whose nonparametric mixture is identified. See Rao (1992). In contrast, if one only
observes whether there is an accident with the risk measured by the probability of such
contingency $\tilde{\theta} = 1 - e^{-\theta}$, then $F_{\theta|S}(\cdot|\cdot)$ is not identified because the nonparametric mixture
of a Binomial distribution does not belong to the aforementioned class and thus is not
identified. See Aryal, Perrigne and Vuong (2009).

13 We have $E[J|dd] = E[J|s] = E\{E[J|\theta,s]|s\} = E\{E[J|\theta,a]|s\} = E\{E[J|\theta]|s\} = E[\theta|s]$, where we have
used A3-(iii) and the one-to-one mapping between $(\theta, a)$ and $(\theta, s)$.

Formally, for a given certainty equivalence $s$, the subpopulation of insurees with coverage
$(t(s),dd(s))$ and their corresponding claims give the moment generating function
as

$$\begin{aligned} M_{J|S}(t|s) &= E[e^{Jt}|S = s] = E\{E[e^{Jt}|\theta,S]\,|\,S = s\} \\ &= E\{E[e^{Jt}|\theta,a]\,|\,S = s\} = E\{E[e^{Jt}|\theta]\,|\,S = s\} \\ &= E[e^{\theta(e^t - 1)}|S = s] = M_{\theta|S}(e^t - 1|s), \end{aligned} \tag{19}$$

where the third equality follows from the one-to-one mapping between $(\theta, s)$ and $(\theta, a)$
and the fourth and fifth equalities from A3-(iii) using the moment generating function of
the Poisson distribution with parameter $\theta$. In particular, (19) shows that the moment
generating function $M_{J|S}(\cdot|s)$ exists for every $t \in \mathbb{R}$ because $\theta$ has a compact support
given $S = s$. Moreover, letting $u = e^t - 1$ shows that

$$M_{\theta|S}(u|s) = M_{J|S}(\log(1 + u)|s)$$

for all $u \in (-1,+\infty)$. Thus $M_{\theta|S}(\cdot|s)$ is identified on a neighborhood of 0 thereby identifying
$F_{\theta|S}(\cdot|s)$. See e.g. Billingsley (1995, p. 390).$^{14}$
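The link between the two moment generating functions can be verified numerically for any finite mixing distribution. The sketch below is illustrative only: it assumes a two-point distribution for $\theta$ (the paper leaves the mixing distribution unspecified), and the function names and the truncation point `j_max` are choices made here. It evaluates $M_J(t)$ directly from the mixed Poisson probabilities and compares it with $M_\theta(e^t - 1)$.

```python
import math

def poisson_pmf(j, theta):
    """Pr[J = j | theta] for a Poisson count with mean theta."""
    return math.exp(-theta) * theta ** j / math.factorial(j)

def mgf_claims(t, thetas, weights, j_max=80):
    """M_J(t) = sum_j e^{jt} Pr[J = j] for a finite Poisson mixture (truncated at j_max)."""
    total = 0.0
    for j in range(j_max + 1):
        prob_j = sum(wt * poisson_pmf(j, th) for th, wt in zip(thetas, weights))
        total += math.exp(j * t) * prob_j
    return total

def mgf_risk(u, thetas, weights):
    """M_theta(u) = E[exp(u * theta)] for the same finite mixing distribution."""
    return sum(wt * math.exp(u * th) for th, wt in zip(thetas, weights))
```

With `thetas = [0.2, 1.0]` and `weights = [0.7, 0.3]`, `mgf_claims(t, ...)` agrees with `mgf_risk(math.exp(t) - 1, ...)`, and conversely `mgf_risk(u, ...)` agrees with `mgf_claims(math.log(1 + u), ...)` for `u > -1`, which is the inversion used in the text.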

The joint density of $(\theta, s)$ is $f(\theta,s) = f(\theta|s)k(s)$, which is identified. From the known
one-to-one mapping that transforms $(\theta, a)'$ into $(\theta, s)'$, namely $T(\theta,a) = [\theta, w - \theta(\phi_a - 1)/a]'$
with $\phi_a = \int e^{aD}\,dH(D)$ and $H(\cdot)$ known, we recover $f(\theta,a)$ as

$$f(\theta,a) = f_{\theta S}(T(\theta,a)) \left| \det \frac{\partial T(\theta,a)}{\partial (\theta,a)'} \right|.$$

14 Alternatively, because $M_{\theta|S}(\cdot|s)$ exists in a neighborhood of 0, then all the moments of $\theta$ given $S = s$
are identified by $M^{(k)}_{\theta|S}(0|s) = E[\theta^k|S = s]$ for $k = 0,1,\dots,\infty$. Since $\theta$ given $s$ has compact support, we
are in the class of Hausdorff moment problems, which are always determinate, i.e., the distribution of $\theta$
given $s$ is uniquely determined by its moments.

This result is formally stated in the following proposition.

Proposition 1: Suppose that a continuum of optimal insurance coverages is offered to
each insuree and all accidents are observed. Under A3, the structure $[F(\cdot,\cdot), H(\cdot)]$ is
identified.

3.2 Case 2: Truncated Damage Distribution 

We maintain the assumption that the insurer offers a continuum of optimal contracts
to each insuree but we now consider that the damage distribution is not fully observed.
Abstracting from dynamic considerations, an accident leads to a claim if and only
if the damage is above the deductible. Thus, we can identify the truncated damage
distribution on $[dd, \overline{d}]$. However, the deductible $dd$ varies across insurees. In particular,
for insurees buying full insurance, the deductible is zero thereby identifying the damage
distribution on its full support $[0, \overline{d}]$. Formally, $H_{D|dd}(\cdot|0) = H_{D|S}(\cdot|\underline{s}) = H_{D|\theta,a}(\cdot|\overline{\theta},\overline{a}) =
H_D(\cdot)$ by A3-(i). Thus, we have the following lemma.

Lemma 4: Under A3, $H(\cdot)$ is identified.

It remains to study the identification of $F(\cdot,\cdot)$. Though the reported number of
accidents $J^*$ is observed, instead of the true $J$, the argument is similar to Case 1.
Specifically, reviewing the argument leading to Lemma 3, $K(\cdot)$ is identified if $E[\theta|dd]$
is. Since accidents are reported only if the damage is above the deductible, we have
$E[\theta|dd] \ne E[J^*|dd]$, where $J^*$ is the number of reported accidents. But $J^*$ given $(J,dd)$ is
distributed as a Binomial with parameters $(J, 1 - H(dd))$ by A3-(i,ii). Thus, $E[J^*|dd] =
E\{E[J^*|J,dd]|dd\} = E[J(1 - H(dd))|dd] = (1 - H(dd))E[J|dd] = (1 - H(dd))E[\theta|dd]$, i.e.
$E[\theta|dd] = E[J^*|dd]/(1 - H(dd))$. Hence, $E[\theta|dd]$ is identified despite the truncated damage
distribution leading to the identification of $K(\cdot)$.
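The thinning step can be checked numerically for a single risk level $\theta$; averaging over $\theta$ given $dd$ then gives the expression in the text. This is a minimal sketch with hypothetical function names and an arbitrary truncation point `j_max`: it computes $E[J^*]$ by summing over the Poisson distribution of $J$ and the Binomial distribution of $J^*$ given $J$, and then rescales by $1 - H(dd)$.

```python
import math

def binom_pmf(k, n, p):
    """Pr[J* = k | J = n] when each accident is reported with probability p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def mean_reported_claims(theta, p_report, j_max=60):
    """E[J*] when J ~ Poisson(theta) and J* | J ~ Binomial(J, p_report)."""
    total = 0.0
    for j in range(j_max + 1):
        prob_j = math.exp(-theta) * theta ** j / math.factorial(j)
        total += prob_j * sum(k * binom_pmf(k, j, p_report) for k in range(j + 1))
    return total

def risk_from_reported(mean_jstar, H_dd):
    """Rescale the mean number of reported claims by 1 - H(dd) to recover E[theta]."""
    return mean_jstar / (1.0 - H_dd)
```

For example, with `theta = 0.8` and `H_dd = 0.3`, the reporting probability is `0.7`, `mean_reported_claims(0.8, 0.7)` is approximately `0.56`, and `risk_from_reported` returns approximately `0.8`.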

Turning to the identification of $F(\theta,a)$, we proceed as in Section 3.1. The moment
generating function of $J^*$ given $s$ is

$$\begin{aligned} M_{J^*|S}(t|s) &= E[e^{J^*t}|S = s] = E\{E[e^{J^*t}|J,S]\,|\,S = s\} = E\{E[e^{J^*t}|J,dd]\,|\,S = s\} \\ &= E\{[H(dd) + (1 - H(dd))e^t]^J\,|\,S = s\} = E\{e^{J\log[H(dd) + (1 - H(dd))e^t]}\,|\,S = s\} \\ &= M_{\theta|S}\big[e^{\log[H(dd) + (1 - H(dd))e^t]} - 1\,\big|\,s\big] = M_{\theta|S}[(1 - H(dd))(e^t - 1)|s], \end{aligned} \tag{20}$$

where the fourth equality uses the moment generating function of the Binomial distribution
$\mathcal{B}(J, 1 - H(dd))$, and the sixth equality uses (19) with $t$ replaced by $\log[H(dd) + (1 -
H(dd))e^t]$. Thus, we obtain

$$M_{\theta|S}(u|s) = M_{J^*|S}\left[\log\left(1 + \frac{u}{1 - H(dd)}\right)\Big|\,s\right]$$

for $u \in (-(1 - H(dd)),+\infty)$. The rest of the argument in Case 1 applies leading to the
following proposition.

Proposition 2: Suppose that a continuum of optimal insurance coverages is offered to
each insuree and accidents are observed if and only if the damage is above the deductible.
Under A3, the structure $[F(\cdot,\cdot), H(\cdot)]$ is identified.



4 Identification with a Finite Number of Contracts 

We now address the identification of the model when only (say) two contracts are offered
given $X$. The identification argument can no longer rely on the identification of the
density of certainty equivalence as we cannot exploit the one-to-one mapping between the
insuree’s certainty equivalence and his deductible choice. There is a continuum of $s \in [\underline{s},\overline{s}]$
values, while there are only a finite number of deductibles. Consequently, the FOCs (14)-(18)
characterizing $(t_1,dd_1,t_2,dd_2)$ will not allow us to identify $F(\theta,a)$. In addition to
the key role played by the observed number of claims, we exploit sufficient variations in
exogenous variables to achieve identification. A notable feature of Section 4 is that we
do not require that the observed coverages $(t_1,dd_1,t_2,dd_2)$ are optimal. Consequently,
the results of this section apply beyond the case of monopoly and can accommodate data from
other market structures with competition among insurers. As before, we distinguish whether the full or
truncated damage distribution is observed. Regarding observables, for each insuree we
need the pair of offered coverages $(t_1,dd_1,t_2,dd_2)$, his choice of coverage, the number of
accidents, their corresponding damages and the characteristics $X$.

4.1 Case 3: Full Damage Distribution 

This case is the closest to Cohen and Einav (2007) who identify the joint distribution
of risk and risk aversion under parametric assumptions. In this section, we show how the
insuree’s optimal coverage choice with a full support assumption and sufficient variations
in some exogenous characteristics can identify nonparametrically $f(\theta,a)$. In view of Cohen
and Einav’s (2007) empirical findings, our identification result is important for several
reasons. First, the nonparametric identification of the joint distribution of risk and risk
aversion offers more flexibility on the dependence between risk and risk aversion. Their
empirical findings display a counterintuitive positive correlation between the latter. Second,
their robustness analysis suggests that the offered contracts are suboptimal with
their estimated positive correlation, i.e., the insurer could increase his profit by adjusting
upward the current low deductibles that are more compatible with a negative correlation.

Our identification results rely on a nonparametric mixture of a Poisson distribution
for the number of claims. Specifically, the probability of the observed claims $J$ conditional
on the characteristics $x$ is given by

$$\Pr[J = j|x] = \int_{\underline{\theta}(x)}^{\overline{\theta}(x)} e^{-\theta}\frac{\theta^j}{j!}\,dF_{\theta|X}(\theta|x),$$

where the mixing distribution $F_{\theta|X}(\cdot|x)$ is left unspecified. Given that all the accidents
and damages are observed, the damage distribution $H(\cdot|X)$ is identified. To establish
identification of $F(\theta,a|X)$, we proceed as follows. We first show the identification of
$F_{\theta|X}(\cdot|\cdot)$ following an argument similar to Case 1. In the second step, we identify the
conditional distribution $F_{a|\theta,X}(\cdot|\cdot,\cdot)$ at the frontier $a(\theta,X)$ between the two sets $\mathcal{S}_1(X)$
and $\mathcal{S}_2(X)$ that partition $\Theta(X) \times \mathcal{A}(X)$ according to the coverage choices of insurees with
characteristics $X$. In the third step, we make an exclusion restriction and a full support
assumption involving some characteristics $Z$ included in $X$ to achieve identification of the
distribution $F_{a|\theta,X}(\cdot|\cdot,\cdot)$ on its support.

For the first step, we exploit again the observed number of accidents. Using an argument
similar to that leading to (19) for the subpopulation of insurees with characteristics
$x$, the moment generating function $M_{J|X}(\cdot|x)$ is

$$\begin{aligned} M_{J|X}(t|x) &= E[e^{Jt}|X = x] = E\{E[e^{Jt}|\theta,X]\,|\,X = x\} \\ &= E\{E[e^{Jt}|\theta]\,|\,X = x\} = E[e^{\theta(e^t - 1)}|X = x] \\ &= M_{\theta|X}(e^t - 1|x), \end{aligned}$$

where the third and fourth equalities follow from A3-(iii). Thus, $f_{\theta|X}(\cdot|\cdot)$ is identified by
its moment generating function

$$M_{\theta|X}(u|x) = M_{J|X}(\log(1 + u)|x)$$

for all $u \in (-1,+\infty)$.

In the second step, we consider the probability that an insuree with risk $\theta$ and characteristics
$X$ chooses the coverage $(t_1(X),dd_1(X))$ as intuitively this provides information
about the insuree’s risk aversion $a$. To do so, we define a discrete variable $\chi$, which takes
values 1 and 2 depending on whether the insuree chooses the coverage $(t_1(X),dd_1(X))$
or $(t_2(X),dd_2(X))$, i.e., whether the insuree’s type $(\theta, a)$ belongs to $\mathcal{S}_1(X)$ or $\mathcal{S}_2(X)$,
respectively. Thus, $\chi = 1$ is also equivalent to $a \le a(\theta,X)$, where the latter is the inverse
of the frontier (13), where $(t_1,dd_1,t_2,dd_2)$ and $H(\cdot)$ now depend on $X$. Namely, $a(\theta,X)$
is the inverse of

$$\theta(a,X) = \frac{t_2(X) - t_1(X)}{\int_{dd_2(X)}^{dd_1(X)} e^{aD}(1 - H(D|X))\,dD}.$$

Our identification strategy below exploits variations of this frontier in $X$. In particular,
even if the deductible does not vary with $X$ as with US data, the premium and possibly
the damage distribution do depend on $X$.

The probability of interest can then be written as $\Pr[\chi = 1|\theta, X = x]$, which is

$$F_{a|\theta,X}[a(\theta,x)|\theta,x] = \frac{f_{\theta|\chi,X}(\theta|1,x)\,\nu_1(x)}{f_{\theta|X}(\theta|x)}$$

by Bayes’ rule, where $\nu_1(x)$ is the proportion of insurees with characteristics $x$ choosing the
coverage $(t_1(x),dd_1(x))$. The latter is identified from the data. Since $f_{\theta|X}(\cdot|\cdot)$ is identified
from the first step, it remains to identify $f_{\theta|\chi,X}(\cdot|1,x)$. Applying the same argument as
in Step 1 but conditioning on $\chi = 1$ as well, we obtain

$$\begin{aligned} M_{J|\chi,X}[t|1,x] &= E[e^{Jt}|\chi = 1, X = x] = E\{E[e^{Jt}|\theta,a,X]\,|\,\chi = 1, X = x\} \\ &= M_{\theta|\chi,X}[e^t - 1|1,x], \end{aligned}$$

where the second equality follows from the equivalence between conditioning on $(\theta, a, \chi, X)$
and conditioning on $(\theta, a, X)$, while the third equality follows, as before, from A3-(iii). Thus,
$f_{\theta|\chi,X}(\cdot|1,\cdot)$ is identified by its moment generating function

$$M_{\theta|\chi,X}(u|1,x) = M_{J|\chi,X}(\log(1 + u)|1,x)$$

for all $u \in (-1,+\infty)$. Hence, $F_{a|\theta,X}[a(\theta,x)|\theta,x]$ is identified for every $\theta \in [\underline{\theta}(x),\overline{\theta}(x)]$ and
$x \in \mathcal{S}_X$.
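The Bayes' rule step can be illustrated on a small discrete example with $X$ held fixed. The sketch below is hypothetical throughout: types take finitely many values, `a_bar` stands in for the frontier $a(\theta, x)$, and the function names are chosen here. It computes $\Pr[\chi = 1|\theta]$ once directly from the joint type distribution and once from the objects that the text shows to be identified, namely $f_{\theta|\chi}(\theta|1)$, $\nu_1$ and $f_\theta(\theta)$.

```python
def choice_probability_direct(theta, types, a_bar):
    """Pr[chi = 1 | theta] computed from the joint distribution of (theta, a)."""
    mass = sum(p for th, a, p in types if th == theta)
    below = sum(p for th, a, p in types if th == theta and a <= a_bar(th))
    return below / mass

def choice_probability_bayes(theta, types, a_bar):
    """Same probability via f(theta | chi = 1) * nu_1 / f(theta)."""
    # nu_1: share of insurees choosing the high-deductible contract (chi = 1)
    nu_1 = sum(p for th, a, p in types if a <= a_bar(th))
    f_theta = sum(p for th, a, p in types if th == theta)
    f_theta_given_1 = sum(
        p for th, a, p in types if th == theta and a <= a_bar(th)
    ) / nu_1
    return f_theta_given_1 * nu_1 / f_theta
```

Each element of `types` is a triple `(theta, a, probability)`. In the paper the conditional density of $\theta$ given $\chi = 1$ is itself recovered from claim counts through the moment generating function; the sketch only covers the final Bayes' rule step.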

To conduct policy counterfactuals the analyst may need to identify $F(\cdot,\cdot|x)$ on the
whole support $\Theta(x) \times \mathcal{A}(x)$. This is the purpose of the third step. To do so, we partition
the vector $X$ into $(W, Z)$. Let $\mathcal{S}_W$ denote the support of $W$ and $\mathcal{S}_{W_1|w_2}$ denote the support
of some variable $W_1$ given some variable $W_2 = w_2$.

Assumption A4: We have

(i) $a \perp Z \,|\, (\theta,W)$,

(ii) $\forall (\theta,a,w) \in \mathcal{S}_{\theta a W}$, there exists $z \in \mathcal{S}_{Z|\theta w}$ such that $a(\theta,w,z) = a$.

Assumption A4-(i) is an exclusion restriction, i.e. $Z$ does not affect risk aversion given
risk and other characteristics $W$. The variable $Z$ needs to be continuous and can be the
car value, the reported annual mileage, the driver’s experience, etc. This gives

$$F_{a|\theta,W,Z}(a(\theta,w,z)|\theta,w,z) = F_{a|\theta,W}(a(\theta,w,z)|\theta,w), \quad \forall (\theta,w,z).$$

Because the left-hand side is identified from the second step, sufficient variations in
$a(\theta,w,z)$ due to $z$ can identify $F_{a|\theta,W}(\cdot|\theta,w)$. This is the purpose of A4-(ii), which is
a full support assumption. Similar assumptions (sometimes called large support assumptions)
have been made in various contexts. See Matzkin (1992, 1993), Lewbel (2000),
Carneiro, Hansen and Heckman (2003), Imbens and Newey (2009) and Berry and Haile
(2014) among others. In our context, this assumption can be interpreted as follows: For
every individual with characteristics $(\theta, a, W)$, there exist some characteristics $Z$ such as
the car value or the mileage for which the insuree is indifferent between the two offered
coverages. The full support assumption is sufficient to guarantee identification since

$$F_{a|\theta,W}(a|\theta,w) = F_{a|\theta,W}[a(\theta,w,z)|\theta,w] = F_{a|\theta,W,Z}[a(\theta,w,z)|\theta,w,z],$$

where the first equality uses the full support assumption and the second equality uses
the exclusion restriction. Note that $a(\cdot,\cdot,\cdot)$ is identified in view of (13). The full support
assumption guarantees that for every $a$ on its support, there exists a known value $z$ such
that $a = a(\theta,w,z)$. Identification of $F(\theta,a|w,z)$ follows using the first step. This result
is formally stated in the next proposition.

Proposition 3: Suppose that two insurance coverages are offered to each insuree and all
accidents are observed for each insuree. Under A3 and A4, the structure $[F(\cdot,\cdot|X), H(\cdot|X)]$
is identified.

Despite pooling due to both multidimensional screening and a finite number of coverages,
Proposition 3 shows that the model primitives are identified by exploiting
the number of accidents and variations in some exogenous variable. In particular, our
identification argument does not require optimality of the offered coverages. This is novel
in the identification of models under incomplete information.

4.2 Case 4: Truncated Damage Distribution 

The data scenario analyzed in Case 4 corresponds to typical insurance data, i.e., a finite
number of contracts offered with claims filed only if damages are above the deductible.
Case 3 has shown that observing a finite number of contracts does not prevent the nonparametric
identification of the joint distribution of risk and risk aversion provided all
accident information is available and there is enough variation in some excluded exogenous
variable. In contrast, the truncation on the damage distribution in Case 4 limits the
extent of identification. Nevertheless, we show that $F(\cdot,\cdot|X)$ is identified up to the knowledge
of the probability to have a damage below the lowest deductible, i.e., $H(dd_2(X)|X)$.$^{15}$
To simplify the notations, we let $H_c(X) = H(dd_c(X)|X)$ hereafter.

We note the relationship between $1 - H_1(X)$ and $1 - H_2(X)$ which allows us to focus
on identification only in terms of $1 - H_2(X)$. Because a claim is filed only if it involves a
damage above the deductible, we identify the truncated damage distributions

$$H_c^*(\cdot|X) = \frac{H(\cdot|X) - H_c(X)}{1 - H_c(X)}$$

on $[dd_c(X),\overline{d}(X)]$ from the subpopulation of insurees buying the coverage $(t_c(X),dd_c(X))$
for $c = 1,2$. Differentiating the above equations and taking their ratio show that

$$\lambda(X) \equiv \frac{h_2^*(D|X)}{h_1^*(D|X)} = \frac{1 - H_1(X)}{1 - H_2(X)}, \tag{21}$$

for all $D \ge dd_1(X)$, where $0 < \lambda(X) < 1$. In particular, the function $\lambda(\cdot)$, which is
the ratio of the truncated damage densities, is identified from the data, while $H(\cdot|X)$ is
identified on $[dd_2(X),\overline{d}(X)]$ up to the knowledge of $H_2(X)$.
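Relation (21) says that the ratio of the two truncated densities does not depend on $D$. A minimal numerical sketch follows, assuming for illustration a damage density proportional to $e^{-rD}$ on $[0, \overline{d}]$; this parametric form, the function names and the parameter values are hypothetical and only serve to check the algebra.

```python
import math

def damage_cdf(D, rate, d_bar):
    """H(D) for a density proportional to exp(-rate * D) on [0, d_bar]."""
    return (1.0 - math.exp(-rate * D)) / (1.0 - math.exp(-rate * d_bar))

def damage_pdf(D, rate, d_bar):
    """h(D), the derivative of damage_cdf on [0, d_bar]."""
    return rate * math.exp(-rate * D) / (1.0 - math.exp(-rate * d_bar))

def truncated_pdf(D, dd, rate, d_bar):
    """Density of damages conditional on exceeding the deductible dd."""
    if D < dd or D > d_bar:
        return 0.0
    return damage_pdf(D, rate, d_bar) / (1.0 - damage_cdf(dd, rate, d_bar))

def lambda_ratio(D, dd1, dd2, rate, d_bar):
    """lambda = h*_2(D) / h*_1(D), defined for D >= dd1 > dd2."""
    return truncated_pdf(D, dd2, rate, d_bar) / truncated_pdf(D, dd1, rate, d_bar)
```

With `dd1 = 4`, `dd2 = 1`, `rate = 0.3` and `d_bar = 10`, `lambda_ratio` returns the same value at every `D` between `dd1` and `d_bar`, equal to `(1 - H(dd1)) / (1 - H(dd2))`, which lies strictly between 0 and 1.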

15 When two contracts are offered, it is never optimal for the insurer to offer full insurance, i.e.,
$dd_2(X) = 0$. Therefore, we cannot use the argument of Case 2 to identify $H(\cdot|X)$ and hence
$H(dd_2(X)|X)$.

We follow similar steps as in Case 3 with $\tilde\theta = (1 - H_2(X))\theta$ replacing $\theta$ while modifying
the argument as $J$ is unobserved. To identify the marginal density $f_{\tilde\theta|X}(\cdot|\cdot)$ of $\tilde\theta$ given $X$,
we exploit the observed number of reported accidents $J^*$. Using a similar argument as in
(20), the moment generating function of $J^*$ given $(\chi, X)$, where $\chi \in \{1,2\}$ indicates the
insuree's contract choice, is

$$\begin{aligned}
M_{J^*|\chi,X}(t|c,x) &= E[e^{J^* t} \,|\, \chi = c, X = x] \\
&= E\{E[e^{J^* t} \,|\, J, \chi, X] \,|\, \chi = c, X = x\} \\
&= E\{[H_\chi(X) + (1 - H_\chi(X))e^t]^J \,|\, \chi = c, X = x\} \\
&= E\{E[e^{J \log[H_\chi(X) + (1 - H_\chi(X))e^t]} \,|\, \theta, \chi, X] \,|\, \chi = c, X = x\} \\
&= E[e^{\theta[H_\chi(X) + (1 - H_\chi(X))e^t - 1]} \,|\, \chi = c, X = x], \qquad (22)
\end{aligned}$$

where the third equality uses the moment generating function of $J^*$ given $(J, \chi, X)$, which
is distributed as a Binomial $\mathcal{B}(J, 1 - H_\chi(X))$ using A3-(ii), and the fifth equality follows
from A3-(iii) and the moment generating function of the Poisson distribution. Thus,

$$M_{\theta|\chi,X}[u|c,x] = M_{J^*|\chi,X}\left[\log\left(1 + \frac{u}{1 - H_c(x)}\right) \Big|\, c, x\right]$$

for $u \in (-1 + H_c(x), +\infty)$. In particular, the distribution of risk $\theta$ given $(\chi, X)$ is
identified up to the knowledge of $H_\chi(X)$.
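The fifth equality in (22) rests on Poisson thinning: if $J$ is Poisson with mean $\theta$ and each accident is reported with probability $1 - H_\chi(X)$, then $J^*$ is Poisson with mean $\theta(1 - H_\chi(X))$. The sketch below checks this numerically by brute-force summation. The values $\theta = 1.5$, a reporting probability of 0.7, and the truncation point `j_max` are illustrative assumptions, not quantities from the paper.

```python
import math


def poisson_pmf(k, rate):
    """P[N = k] for N ~ Poisson(rate), computed in logs for stability."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def binomial_pmf(k, n, p):
    """P[B = k] for B ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def reported_pmf(j_star, theta, p_report, j_max=80):
    """P[J* = j_star | theta]: Poisson(theta) accidents, each reported w.p. p_report.

    The infinite sum over the unobserved J is truncated at j_max (an assumption
    that is harmless when theta is small relative to j_max).
    """
    return sum(
        poisson_pmf(j, theta) * binomial_pmf(j_star, j, p_report)
        for j in range(j_star, j_max + 1)
    )


def mgf_reported(t, theta, p_report, j_max=80):
    """E[exp(t J*) | theta], obtained by summing over the support of J*."""
    return sum(
        math.exp(t * k) * reported_pmf(k, theta, p_report, j_max)
        for k in range(j_max + 1)
    )


def mgf_closed_form(t, theta, p_report):
    """Integrand of the last line of (22): exp(theta * (1 - H) * (e^t - 1))."""
    return math.exp(theta * p_report * (math.exp(t) - 1.0))
```

For instance, `mgf_reported(0.3, 1.5, 0.7)` and `mgf_closed_form(0.3, 1.5, 0.7)` agree up to numerical error, and `reported_pmf(2, 1.5, 0.7)` matches `poisson_pmf(2, 1.05)`.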

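Since $\tilde\theta = (1 - H_2(X))\theta$, its moment generating function given $(\chi, X)$ is

$$\begin{aligned}
M_{\tilde\theta|\chi,X}(u|c,x) &= M_{\theta|\chi,X}(u(1 - H_2(x))|c,x) \\
&= M_{J^*|\chi,X}\left[\log\left(1 + u\,\frac{1 - H_2(x)}{1 - H_c(x)}\right) \Big|\, c, x\right] \\
&= \begin{cases}
M_{J^*|\chi,X}\left[\log\left(1 + \dfrac{u}{\lambda(x)}\right) \Big|\, 1, x\right] & \text{if } c = 1, \\[2mm]
M_{J^*|\chi,X}\left[\log(1 + u) \,|\, 2, x\right] & \text{if } c = 2,
\end{cases} \qquad (23)
\end{aligned}$$

for all $u \in (-\lambda(x), +\infty)$ and $u \in (-1, +\infty)$, respectively. Thus, the moment generating
function of $\tilde\theta$ given $X$ is

$$\begin{aligned}
M_{\tilde\theta|X}(u|x) &= E\{E[e^{u\tilde\theta}|\chi, X] \,|\, X = x\} \\
&= M_{J^*|\chi,X}\left[\log\left(1 + \frac{u}{\lambda(x)}\right) \Big|\, 1, x\right]\nu_1(x)
 + M_{J^*|\chi,X}\left[\log(1 + u) \,|\, 2, x\right]\nu_2(x), \qquad (24)
\end{aligned}$$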

for $u \in (-\lambda(x), +\infty)$, showing that $f_{\tilde\theta|X}(\cdot|\cdot)$ is identified as $\lambda(X)$, $\nu_1(X)$ and $\nu_2(X)$ are
known from the data. Since $f_{\theta|X}(\theta|x) = (1 - H_2(x))f_{\tilde\theta|X}((1 - H_2(x))\theta|x)$, the former
density is identified up to $H_2(x)$.

In the second step, as in Case 3, we consider the probability that an insuree with risk $\tilde\theta$
and characteristics $X$ chooses the coverage $(t_1(X), dd_1(X))$. Using (13) and $1 - H(D|X) =
(1 - H_2(X))(1 - H_2^*(D|X))$, we remark that the optimal frontier between buying the two
coverages in the space $(\tilde\theta, a)$ is given by

$$\tilde\theta(a, X) = \frac{t_2(X) - t_1(X)}{\int_{dd_2(X)}^{dd_1(X)} e^{aD}[1 - H_2^*(D|X)]\,dD}, \qquad (25)$$

leading to the inverse $\tilde a(\tilde\theta, X)$, which is identified. As before, from Bayes' rule we have

$$F_{a|\tilde\theta,X}(\tilde a(\tilde\theta,x)|\tilde\theta,x) = \frac{f_{\tilde\theta|\chi,X}(\tilde\theta|1,x)\,\nu_1(x)}{f_{\tilde\theta|X}(\tilde\theta|x)}, \qquad (26)$$

where $\nu_1(x)$ and $f_{\tilde\theta|X}(\tilde\theta|x)$ are identified. Moreover, $f_{\tilde\theta|\chi,X}(\cdot|1,x)$ is identified because its
moment generating function $M_{\tilde\theta|\chi,X}(\cdot|1,x)$ is identified on $(-\lambda(x), +\infty)$ as shown above.

In the third step, we note that $F_{a|\tilde\theta,X}(\tilde a(\tilde\theta,x)|\tilde\theta,x) = F_{a|\theta,X}(a(\theta,x)|\theta,x)$, thereby
identifying the latter up to $H_2(x)$ since $\tilde\theta = (1 - H_2(x))\theta$. Under A4, the rest of the argument is
similar to that of Case 3, leading to the identification of $F_{a|\theta,W}(\cdot|\cdot,\cdot)$ and then of $F(\theta,a|W,Z)$
up to the knowledge of $H_2(X)$. We have then proved the following result.


Proposition 4: Suppose that two insurance coverages are offered to each insuree and
accidents are observed only when damages are above the deductible. Under A3 and A4,
the structure $[F(\cdot,\cdot|X), H(\cdot|X)]$ is identified up to $H_2(X)$.

Up to now, we have not used the optimality of the offered coverages. Specifically, we
have not used the FOC (14)-(18) determining the optimal insurance coverages $(t_1(X),
dd_1(X), t_2(X), dd_2(X))$. One might ask whether the use of these FOC may help in
identifying some features of the structure or even the full structure itself. For instance, we
note that (18) identifies $\underline a(X)$ because the latter solves the identifying equation

$$t_1(X) = \frac{\underline{\tilde\theta}(X)}{\underline a(X)} \int_{dd_1(X)}^{\bar d(X)} \left(e^{\underline a(X) D} - e^{\underline a(X) dd_1(X)}\right) h_2^*(D|X)\,dD,$$

using $h(D|X) = [1 - H_2(X)]h_2^*(D|X)$ and $\underline{\tilde\theta}(X) = \underline\theta(X)[1 - H_2(X)]$. A consequence of
Proposition 4 is that the structure $[F(\cdot,\cdot|X), H(\cdot|X)]$ is identified if and only if $H_2(X)$
is identified. The next lemma shows that $H_2(X)$ is not identified even when considering
coverage optimality through the FOC (14)-(18).

Lemma 5: Suppose that two insurance coverages are offered to each insuree and accidents
are observed only when damages are above the deductible. Under A3 and A4, $H_2(X)$ is
not identified.

The proof is given in the appendix. It relies on exhibiting an observationally equivalent
structure. The nonidentification may be surprising but can be explained as follows. It
arises from a compensation between the increase (decrease) in the number of accidents and
an appropriate decrease (increase) in the probability of damages being greater than the
deductible. From the insuree's perspective, such a compensation maintains the relative
ranking between the two contracts. Thus, if a $(\theta, a)$-insuree buys $(t_1(X), dd_1(X))$ then the
$((1 - H_2(X))\theta, a)$-insuree also buys the same coverage if there is an appropriate increase
in the probability of damages being greater than $dd_1(X)$. From the insurer's perspective,
the decrease in the average number of accidents is compensated by an appropriate increase
in the probability that the damage is above the deductible. Thus the expected payment
to the insuree remains the same under either coverage.
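The compensation behind Lemma 5 can be illustrated with a small numerical sketch. It assumes, as in A3, Poisson accidents thinned by the probability that a damage exceeds the deductible. Scaling the risk by a factor `kappa` and dividing every survival probability $1 - H(D|X)$ above the lowest deductible by the same factor leaves the distribution of reported accidents and of reported damages unchanged. The numbers and function names below are hypothetical and serve only as an example.

```python
import math


def reported_count_pmf(k, theta, tail_prob):
    """P[J* = k] when J ~ Poisson(theta) and each damage is reported w.p. tail_prob."""
    rate = theta * tail_prob
    return math.exp(-rate) * rate ** k / math.factorial(k)


def reported_damage_cdf(survival_at_d, survival_at_dd):
    """Truncated cdf [H(D) - H(dd)] / [1 - H(dd)], written in terms of 1 - H."""
    return 1.0 - survival_at_d / survival_at_dd


def rescale(theta, survivals, kappa):
    """Alternative structure: risk kappa * theta, survival probabilities / kappa.

    kappa must exceed every survival probability so that the rescaled values
    remain valid probabilities (the condition on kappa used in Lemma 5).
    """
    if kappa <= max(survivals):
        raise ValueError("kappa must exceed sup[1 - H_2(x)]")
    return kappa * theta, [s / kappa for s in survivals]


# Hypothetical original structure: 1 - H at dd2, at dd1, and at some D > dd1.
theta, survivals = 2.0, [0.8, 0.5, 0.2]
theta_alt, survivals_alt = rescale(theta, survivals, kappa=1.6)

same_counts = all(
    math.isclose(
        reported_count_pmf(k, theta, survivals[0]),
        reported_count_pmf(k, theta_alt, survivals_alt[0]),
    )
    for k in range(10)
)
same_damages = math.isclose(
    reported_damage_cdf(survivals[2], survivals[0]),
    reported_damage_cdf(survivals_alt[2], survivals_alt[0]),
)
```

Both `same_counts` and `same_damages` are true here, although the two structures imply different values of $H_2$ (0.2 and 0.5, respectively).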

5 Discussion and Model Restrictions 

This section discusses identification strategies for the probability $H_2(X)$ and characterizes
all the model restrictions on observables associated with the model of Case 4.

5.1 Identification Strategies for $H_2(X)$

From Section 4.2, any assumption that identifies $H_2(X)$ identifies the structure $[F(\cdot,\cdot|X),
H(\cdot|X)]$ on its support. We discuss some identifying assumptions/conditions for $H_2(X)$ as
well as its partial identification. A first strategy to identify $H_2(X)$ is to parameterize the
damage distribution $H(\cdot|X)$ as $H(\cdot|X;\beta)$ on $[0, \bar d(X)]$ with $\beta \in B \subset \mathbb{R}^q$. Observations
on reported damages $D^*$ identify $\beta$ and hence $H(\cdot|X)$ on $[0, \bar d(X)]$. Thus $H_2(X) =
H(dd_2(X)|X;\beta)$ is identified. In particular, we can choose a parametrization to fit the
estimated truncated damage distribution $H^*(\cdot|X)$.

A second strategy is to consider additional data sources on the average of either the
number of accidents or the damages. For instance, suppose that for every $x \in S_X$, we
know the average number of accidents $\mu(x) = E[J|X = x] = E\{E[J|\theta, X = x]|X = x\}
= E[\theta|X = x]$ by A3-(iii). For the average number of reported accidents, we have
$\mu_c^*(x) = E[J^*|\chi = c, X = x] = E\{E[J^*|J, \chi = c, X = x]|\chi = c, X = x\} =
E[J(1 - H_c(X))|\chi = c, X = x] = [1 - H_c(x)]E[\theta|\chi = c, X = x]$ for $c = 1, 2$ since $J^*$
given $(J, \chi, X)$ is distributed as a Binomial with parameters $(J, 1 - H_\chi(X))$. Thus

$$\begin{aligned}
\mu(x) &= \nu_1(x)E[\theta|\chi = 1, X = x] + \nu_2(x)E[\theta|\chi = 2, X = x] \\
&= \frac{1}{1 - H_2(x)}\left(\nu_1(x)\frac{\mu_1^*(x)}{\lambda(x)} + \nu_2(x)\mu_2^*(x)\right).
\end{aligned}$$

This leads to the identification of $H_2(x)$ given that $\nu_c(x)$, $\mu_c^*(x)$, $c = 1, 2$ and $\lambda(x)$ are
identified from the data as shown in Section 4.2. Alternatively, suppose that we know
only $E[J|X = x_0]$ for some $x_0$. Using the same argument establishes the identification of
$H_2(x_0)$. This combined with a support assumption such as $\bar\theta(x) = \bar\theta$ for every $x$ identifies
$H_2(x)$. Specifically, note that we have $\overline{\tilde\theta}(x) = (1 - H_2(x))\bar\theta(x)$, where $\overline{\tilde\theta}(x)$ is the upper
boundary of the support of $f_{\tilde\theta|X}(\cdot|X = x)$, which is identified as shown in Section 4.2.
Applying this equation at $x_0$ identifies $\bar\theta$ by $\overline{\tilde\theta}(x_0)/(1 - H_2(x_0))$. Applying again this
equation at different values $x$ identifies $H_2(x)$. A similar argument applies at the lower
bound $\underline\theta(x) = \underline\theta$.
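The second strategy amounts to solving the displayed expression for the average number of accidents in terms of $H_2(x)$. A minimal sketch, assuming that the average number of accidents is available from an outside source and that the other inputs have been estimated beforehand, could look as follows. The function name, argument names, and numerical values are hypothetical.

```python
def h2_from_mean_accidents(mu, nu1, nu2, mu1_star, mu2_star, lam):
    """Solve mu = [nu1 * mu1_star / lam + nu2 * mu2_star] / (1 - H2) for H2.

    mu        : average number of accidents E[J | X = x] (external information)
    nu1, nu2  : shares of insurees choosing coverage 1 and 2
    mu*_star  : average number of reported accidents under each coverage
    lam       : ratio of truncated damage densities, as in (21)
    """
    if mu <= 0.0 or not 0.0 < lam < 1.0:
        raise ValueError("need mu > 0 and 0 < lam < 1")
    scaled_reports = nu1 * mu1_star / lam + nu2 * mu2_star
    return 1.0 - scaled_reports / mu


# Hypothetical example built from H1 = 0.4, H2 = 0.2, E[theta|1] = 1, E[theta|2] = 2.
lam = (1.0 - 0.4) / (1.0 - 0.2)          # 0.75
mu = 0.6 * 1.0 + 0.4 * 2.0               # average number of accidents
mu1_star = (1.0 - 0.4) * 1.0             # reported accidents, coverage 1
mu2_star = (1.0 - 0.2) * 2.0             # reported accidents, coverage 2
h2 = h2_from_mean_accidents(mu, 0.6, 0.4, mu1_star, mu2_star, lam)
```

In this constructed example the function recovers `h2` equal to 0.2 up to rounding, the value used to generate the inputs.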

Regarding damages, we note that

$$E(D|X = x) = H_2(x)E[D|D < dd_2(x), X = x] + (1 - H_2(x))E[D|D \ge dd_2(x), X = x],$$

where $E[D|D \ge dd_2(x), X = x]$ is identified from the data. Thus, for every $x$ it is
straightforward to see that identification of $H_2(x)$ requires knowledge of both
$E[D|D < dd_2(x), X = x]$ and $E(D|X = x)$. In particular, the knowledge of the latter is not
sufficient, in contrast to the previous case in which the average number of accidents was
sufficient for identification. As above, if one knows $E[D|D < dd_2(x_0), X = x_0]$ and
$E(D|X = x_0)$ for some $x_0$ and if either $\bar\theta(x)$ or $\underline\theta(x)$ is independent of $x$, then $H_2(x)$ is
identified for every $x$.

A third strategy is to derive some bounds on the probability $H_2(X)$. This approach,
also known as partial identification, was popularized by Manski and Tamer (2002) and
Chernozhukov, Hong and Tamer (2007). See also Haile and Tamer (2003) and Kovchegov
and Yildiz (2009) for nonparametric bounds. Our bounds are in the spirit of the latter
as they are nonparametric. Let $[F^0(\cdot,\cdot|X), H^0(\cdot|X)]$ be the true structure. Given an
arbitrary value $x$, Proposition 4 implies that it is sufficient to determine the identified
set for $H_2(x)$, i.e., the set of values $\tilde H_2(x)$ that are observationally equivalent to $H_2^0(x)$.$^{16}$
The proof of Lemma 5 shows that any value $\tilde H_2(x) = 1 - (1/\kappa)[1 - H_2^0(x)]$ for $\kappa >
\sup_{\tilde x}[1 - H_2^0(\tilde x)]$ is observationally equivalent to $H_2^0(x)$. Thus, the identified set for $H_2(x)$
contains the interval

$$\left(1 - \frac{1 - H_2^0(x)}{\sup_{\tilde x}[1 - H_2^0(\tilde x)]},\; 1\right).$$

For values $x$ for which $1 - H_2^0(x)$ is close to the supremum, the left boundary approaches
zero. Hence, the identified set is close to $(0,1)$, which is not informative.

To tighten these bounds, we may rely on some empirical evidence in Cohen and Einav
(2007). In particular, their estimated damage density decreases when the damage
approaches the deductible from above, suggesting that the density below the deductible is
not greater than its value at the deductible. Thus we can assume that the damage
density satisfies $h(D|x) \le h[dd_2(x)|x]$ for every $D \le dd_2(x)$ and $x \in S_X$. Integrating both
sides from 0 to $dd_2(x)$ we obtain $0 \le H_2(x) \le dd_2(x)h(dd_2(x)|x)$. Dividing both sides by
$1 - H_2(x)$, and using the definition of the truncated density $h_2^*(\cdot|x)$, we obtain

$$0 \le \frac{H_2(x)}{1 - H_2(x)} \le dd_2(x)h_2^*(dd_2(x)|x).$$

Solving for $H_2(x)$ gives the bounds

$$0 \le H_2(x) \le \frac{dd_2(x)h_2^*(dd_2(x)|x)}{1 + dd_2(x)h_2^*(dd_2(x)|x)} \equiv B(x).$$
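The bound $B(x)$ only involves the lowest deductible and the truncated density of reported damages evaluated at that deductible, so it is simple to compute once that density has been estimated. The sketch below also derives the implied bounds for $H_1(x)$ from the relationship $1 - H_1(x) = \lambda(x)[1 - H_2(x)]$ in (21). Function and argument names are hypothetical, and the numerical example assumes a uniform damage distribution purely for illustration.

```python
def h2_upper_bound(dd2, h2_star_at_dd2):
    """Upper bound B(x) = r / (1 + r) with r = dd2 * h2*(dd2 | x)."""
    if dd2 < 0.0 or h2_star_at_dd2 < 0.0:
        raise ValueError("deductible and density must be nonnegative")
    r = dd2 * h2_star_at_dd2
    return r / (1.0 + r)


def h1_bounds(lam, dd2, h2_star_at_dd2):
    """Bounds on H1 implied by 0 <= H2 <= B(x) and 1 - H1 = lam * (1 - H2)."""
    r = dd2 * h2_star_at_dd2
    return 1.0 - lam, 1.0 - lam / (1.0 + r)


# Illustration: damages uniform on [0, 10] with dd2 = 2, so that H2 = 0.2 and
# the truncated density at the deductible is 0.1 / 0.8 = 0.125.
bound = h2_upper_bound(2.0, 0.125)

# With dd1 = 5, lam = (1 - 0.5) / (1 - 0.2) = 0.625 and the true H1 is 0.5.
lower_h1, upper_h1 = h1_bounds(0.625, 2.0, 0.125)
```

In the uniform example the density is flat below the deductible, so the bound is attained: `bound` equals the true $H_2 = 0.2$ and `upper_h1` equals the true $H_1 = 0.5$, up to rounding.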


In particular, the upper bound for $H_2(x)$ is strictly less than 1. Moreover, a useful feature
of this upper bound is that it can be estimated as it depends on observables.$^{17}$

16 To be precise, this is the set of values $\tilde H_2(x)$ corresponding to structures $[\tilde F(\cdot,\cdot|X), \tilde H(\cdot|X)]$ that are
observationally equivalent to $[F^0(\cdot,\cdot|X), H^0(\cdot|X)]$.


5.2 Model Restrictions 


This section derives the restrictions imposed by the model on observables under the data
scenario of Case 4, i.e., a finite number of contracts and a truncated damage distribution.
We can use these restrictions to test the model and its assumptions. For every insuree, we
observe $[J^*, D_1^*, \ldots, D_{J^*}^*, \chi, T, DD, X]$, where $D_j^*$ denotes the damage for the $j$th reported
accident and $(T, DD)$ are the premium and deductible chosen by the insuree. From the
model, $T$ and $DD$ are given by $T = t_\chi(X)$ and $DD = dd_\chi(X)$, where $t_\chi(X)$ and $dd_\chi(X)$
for $\chi = 1, 2$ are functions of $X$ satisfying the first-order conditions (14)-(18). Thus,
the vector of observables has a joint distribution $\Psi(\cdot,\ldots,\cdot)$ with a density $\psi(\cdot,\ldots,\cdot) =
\psi_{D_1^*,\ldots,D_{J^*}^*|J^*,\chi,X}(\cdot,\ldots,\cdot|\cdot,\cdot,\cdot) \times \psi_{J^*|\chi,X}(\cdot|\cdot,\cdot) \times \psi_{\chi|X}(\cdot|\cdot) \times \psi_X(\cdot)$.

17 Similarly, exploiting the relationship $1 - H_2(x) = [1 - H_1(x)]/\lambda(x)$ we obtain

$$1 - \lambda(x) \le H_1(x) \le 1 - \frac{\lambda(x)}{1 + dd_2(x)h_2^*(dd_2(x)|x)}.$$

The lower and upper bounds for $H_1(x)$ are strictly larger than zero and smaller than one, respectively.

The next lemma provides necessary and sufficient conditions on the joint distribution
$\Psi(\cdot,\ldots,\cdot)$ to be rationalized by a structure $[F(\cdot,\cdot|\cdot), H(\cdot|\cdot)] \in \mathcal{F}_X \times \mathcal{H}_X$. Let
$\mathcal{H}^*_{cX}$ be defined as the set $\mathcal{H}_X$ in Definition 2 with the difference that the support is
$[dd_c(X), \bar d(X)]$ for $c = 1, 2$. We introduce the remaining notations to write the model
restrictions implied by the full support assumption and the first-order conditions (14)-(18).
The insurer's expected payment per accident given the coverage $c$ and characteristics
$x$ is denoted $E[P|c,x] = \int_{dd_c(x)}^{\bar d(x)} (1 - \Psi_{D^*|\chi,X}(D|c,x))\,dD$ for $c = 1, 2$. Let $\tilde\theta(a) =
\tilde\theta(a,x)$ and $\tilde a(\tilde\theta) = \tilde\theta^{-1}(\tilde\theta, x)$ as in (25) with $H_2^*(D|X) = \Psi_{D^*|\chi,X}(D|2,X)$. In particular,
$\tilde\theta(\cdot)$ and $\tilde a(\cdot)$ are known from $\Psi(\cdot,\ldots,\cdot)$. Let $f_{\tilde\theta|\chi,X}(\cdot|\cdot,\cdot)$ and $f_{\tilde\theta|X}(\cdot|\cdot)$ be the densities
given by the moment generating functions (23) and (24) with $\nu_c(x) = \psi_{\chi|X}(c|x)$ for
$c = 1, 2$ and $\lambda(x) = \psi_{D^*|\chi,X}(\cdot|2,x)/\psi_{D^*|\chi,X}(\cdot|1,x)$. These densities are also known from
$\Psi(\cdot,\ldots,\cdot)$. We denote by $\underline{\tilde\theta} = \underline{\tilde\theta}(x)$ the lower bound of the support of $f_{\tilde\theta|X}(\cdot|\cdot)$. Let
$f_{\tilde\theta,a|X}(\cdot,\cdot|\cdot) = f_{a|\tilde\theta,X}(\cdot|\cdot,\cdot)f_{\tilde\theta|X}(\cdot|\cdot)$, where $f_{a|\tilde\theta,X}(\cdot|\cdot,\cdot)$ is obtained from (26) using A4. Let
$[\underline a, \bar a] = [\underline a(x), \bar a(x)]$ be the support of $f_{a|X}(\cdot|x)$, while $a^* = a^*(x) = \min\{\bar a, \tilde a(\underline{\tilde\theta}, x)\}$. Lastly,
we define

p(x) = i/} XjX (l,x) + J ti(x)-0(a)E[P|l,x] f §a]x (9(a),a\x)^j^-da 
~ J t 2 (x)-6(a)E[P\2,x\ f §MX (e(a),a\x)^j^da, 

which expresses the Lagrange multiplier in terms of observables using (14). 

Lemma 6 (Rationalization Lemma): Let $\Psi(\cdot,\ldots,\cdot)$ be the distribution of $(J^*, D_1^*, \ldots,
D_{J^*}^*, \chi, X)$. Under A3 and A4, $[F(\cdot,\cdot|\cdot), H(\cdot|\cdot)] \in \mathcal{F}_X \times \mathcal{H}_X$ rationalizes $\Psi(\cdot,\ldots,\cdot)$ if and
only if the latter satisfies the following conditions:

(i) . ■■■,-|v, ■) = nbi )■ where ’I'|.. .v '-I-■ ■) = 'I'/.- ,v' ^ ■) 

e W xX , 

(ii) For all $x \in S_X$, $\psi_{D^*|\chi,X}(\cdot|2,x)$ and $\psi_{D^*|\chi,X}(\cdot|1,x)$ are strictly positive on $[dd_2(x), \bar d(x)]$
and $[dd_1(x), \bar d(x)]$, respectively. Moreover, their ratio $\lambda(x)$ is independent of $d \in [dd_1(x),
\bar d(x)]$ with $0 < \lambda(x) < 1$,

(Hi) For every ( 9,x ) G S§ x 

fe\ x ,w,z [9\l,w,z\ip xl w,z(l\w,z) 

-~-1 Z G e> 

fe\w,z(@\ w i z ) 

(iv) The coverage terms $t_1(\cdot), t_2(\cdot), dd_1(\cdot), dd_2(\cdot)$ satisfy $0 < t_1(\cdot) < t_2(\cdot)$, $\bar d(\cdot) > dd_1(\cdot)
> dd_2(\cdot) \ge 0$, and

J t 1 (x)-9(a)E[P\l,x] fg MX (9(a),a\x)^^-da + E[J*\l,x]7p Xj x,z(l,x) 

~J t 2 (x)-9(a)E[P\2,x] f§ MX (0(a),a\x)^^-da - p(x)9e^ ddl{x) = 0 (27) 

J t 1 (x)-9(a)E[P\l,x) f §a]x {9(a),a\x)^^-da + ^ x \ x (2\x) 

-J t 2 (x)-9(a)E[P\2,x] fe,a\x^( a )^ a \ x )-^ da = 0 ( 28 ) 

J t 1 (x)-9(a)E[P\l,x] fg alx (9(a),a\x)^^da + E(J*\x = 2,x)^ XtX (‘2\x) 

~ J a t 2 {x)-9{a)E[P\2,x\ fe ia \x{h a )^ a \ x )^^ da = 0 ( 29 ) 

ti(x) = = [ ( } (e^ D ~ e ^ dddx) ) 4> D .\ XiX (D\l,x)dD . (30) 

Ql Jddi(x) 

Condition (i) says that reported damages are independent and identically distributed
given the coverage choice and individual characteristics. In addition, reported damages
are independent of the reported number of accidents given these variables. This is a
consequence of A3-(i, ii) on damages and number of accidents. Condition (ii) requires that
the densities of reported damages, given coverage choice and individual characteristics,
are strictly positive on their supports. More importantly, the ratio of these densities needs
to be independent of the level of reported damage following (21). This property is also a
consequence of A3-(i, ii), i.e., damages are i.i.d. and independent of the coverage choice
and hence of $(\theta, a)$. Condition (iii) says that the probability of choosing coverage 1 by
a $(\theta, a)$-insuree takes all values in $[0,1]$ as the characteristic $Z$ varies. This follows from
(26) and the full support condition in A4-(ii). Condition (iv) relates the distribution of
observables to the coverage terms. In particular, it requires that the optimal premium
and deductible for the two coverages must satisfy the FOC (14)-(18). There is also a fifth
condition that follows from the compact support of the joint distribution of risk and risk
aversion and its non-vanishing density in Definition 1. This technical condition is given
in the Appendix.

The rationalization lemma is important for several reasons. First, the insurance model
with multidimensional private information does impose some restrictions on observables.
In view of bunching due to multidimensional screening, and a finite number of coverages,
one could have expected otherwise. For instance, in auction models, a restriction arises
from the monotonicity of the equilibrium bidding strategy, which is not present here
because of the finite number of contracts. Second, Lemma 6 characterizes all the restrictions
on the distribution of observables that we can use to test the validity of the model and
its assumptions. Violation of a single restriction by the data would reject the model. We
can then develop some testing procedures for each condition. For instance, we can test
(i) using conditional independence tests. See, e.g., Su and White (2008). We can test the
independence of $\lambda(x)$ from damage by noting that the ratio of the densities is equal to
$\psi_{D^*|\chi,X}(dd_1(x)|2,x)/\psi_{D^*|\chi,X}(dd_1(x)|1,x)$. We can then derive a Cramér-von Mises type
test relying on nonparametric estimates of the densities following Brown and Wegkamp
(2002). Condition (iii) implies that the full support assumption in A4 is also testable.

Third, (iv) provides restrictions on the coverage terms suggesting that we can test
their optimality. This contrasts with the previous structural literature in which one
assumes that the observations are the outcomes of some equilibrium. For instance, in
auctions, identification relies on the optimality of observed bids. This represents a strong
assumption that might be questionable from an empirical point of view. When the number
of contracts is finite, we do not use optimality of the coverage terms to identify the model
structure. Thus, we can use (27)-(30) to test the optimality of the observed coverages
$(T_1, DD_1, T_2, DD_2)$ in the case of a monopoly. From an empirical point of view, the system
(27)-(30) gives the optimal coverages from observables. Hence, it allows us to assess the
profit loss for the insurer from using the actual coverages. Fourth, because restrictions (i)-
(iii) do not require that the insurer is a monopoly, they are also valid to test Assumptions
A3 and A4 under alternative forms of competition in the insurance industry.

6 Conclusion


Our paper addresses the identification of insurance models with multidimensional
screening, where insurees have private information about both their risk and risk aversion. Our
model also includes a random damage and the possibility of multiple accidents. Screening
of insurees relies on their certainty equivalence. Specifically, we investigate how data
availability on the number of offered coverages and reported accidents affects identification of
the model primitives through several data scenarios. Overall, the number of accidents
plays a crucial role and we identify the model structure despite bunching due to
multidimensional screening and/or the finite number of offered coverages. In particular, our
identification results under a finite number of coverages apply to any form of competition.
Specifically, they identify the distribution of insurees' risk and risk aversion for each firm
in the industry. In addition, we provide all the restrictions imposed by the model on
observables. An interesting feature is that optimality of the offered finite coverages can
be tested separately as identification of the model does not rely on this property.

In terms of future lines of research, first our results extend to a broad range of insurance
data such as in health provided the analyst observes a repeated outcome, e.g., insurees'
claims. In particular, we may want to extend our identification results to the case in which
damages are no longer mutually independent and are correlated with the insuree's private
information to allow for moral hazard. Second, in the case of automobile insurance, we could
endogenize the car choice given the insuree's risk and risk aversion. This would lead to a model
explaining the car choice, the coverage choice, the number of accidents and the damages. Third, our
identification results are constructive and thus provide explicit equations for developing
a nonparametric estimation procedure. Our model restrictions can be used to develop
a test of the model validity and of the coverage optimality. These restrictions are also
the basis for testing adverse selection in insurance within a multidimensional private
information setting. Several existing data sets on automobile and/or home insurance used
in Israel (2005a,b), Cohen and Einav (2007), Sydnor (2010) and Barseghyan, Molinari,
O'Donoghue and Teitelbaum (2013) can be reanalyzed in view of our results.

Appendix


Proof of Lemma 1: The derivatives of the certainty equivalences (5) and (6) with respect to $\theta$
give $-(\phi_a - 1)/a$ and $-(\phi_a^* - 1)/a$, respectively. Since $\phi_a > 1$ and $\phi_a^* > 1$, we obtain the desired
result. Regarding the derivative of (5) with respect to $a$, we obtain

$$\frac{\partial CE(0, 0; \theta, a)}{\partial a} = -\theta\,\frac{\left[aE[D\exp(aD)] - E[\exp(aD)] + 1\right]}{a^2}.$$

It suffices to show that the numerator in brackets is positive. It is equal to $E[aD\exp(aD) -
\exp(aD) + 1]$. Letting $\lambda = aD$, it is easy to show that $\lambda\exp(\lambda) - \exp(\lambda) + 1$ is an increasing
function equal to 0 at $\lambda = 0$. Since $aD \ge 0$, the numerator is positive and hence the derivative
is negative. A similar argument applies to $CE(t, dd; \theta, a)$ by letting $\lambda = a\min(dd, D)$. □

Derivation of First-Order Conditions (11) and (12): The Hamiltonian is 

H(t(s), dd(s)) = t(s)-E(9\s) [ (1 - H(D))dD k(s ) 

Jdd(s ) 

+v(s)t'(s) + y(s)dd'(s ) + r(s) [dd'(s) + r/(s, a(s ), dd(s))t' (s)] , 


where t(s ) and dd(s) are the state variables, t'(s ) and dd'(s) are the control variables, v(s), y(s ) 
and r(s) are the co-state variables. The first-order conditions are 


dH 

dt'(s) 

dH 

ddd'(s) 
dH _ 
~~dt ~ 
dH 
ddd 


v(s) + r(s)r](s 1 a(s), dd(s)) = 0 
= y(s) + r(s ) = 0 


— k(s) = v'(s) 

- Ewa - Hm)k(s )+ r(s) ^.ywv M 


y'(s) 


with transversality conditions y(s) = 0 and v(s) = 0. Integrating the third equation and
using the transversality condition v(s) = 0 gives −K(s) = v(s). The first two equations give
K(s) − y(s)η[s, a(s), dd(s)] = 0. Using r(s) = −y(s) and (8) in rewriting the last equation gives
the desired result. □


Proof of Lemma 2: Let $s' > s$ and $\theta$ be fixed and arbitrary. Following (6), the certainty
equivalence when buying insurance can be written as

$$CE(t(s), dd(s); \theta, a) = w - t(s) - m(dd(s), s),$$

where $m(dd(s), s) = (\theta/a)\left[\int_0^{dd(s)} e^{aD}dH(D) + e^{a\,dd(s)}(1 - H(dd(s))) - 1\right]$ and $(\theta, a)$ is such that
$s(\theta, a) = s$. The (IC) constraints for $s$ and $s'$ give

$$\begin{aligned}
w - t(s) - m(dd(s), s) &\ge w - t(s') - m(dd(s'), s), \\
w - t(s') - m(dd(s'), s') &\ge w - t(s) - m(dd(s), s').
\end{aligned}$$

Adding the two inequalities gives upon simplification

$$m(dd(s'), s) - m(dd(s), s) \ge m(dd(s'), s') - m(dd(s), s').$$

Since $m(\cdot,\cdot)$ is differentiable in both arguments, we get

$$\begin{aligned}
&\int_{dd(s)}^{dd(s')} \frac{\partial m(\xi, s)}{\partial \xi}\,d\xi - \int_{dd(s)}^{dd(s')} \frac{\partial m(\xi, s')}{\partial \xi}\,d\xi \ge 0, \\
&\int_{dd(s)}^{dd(s')} \left[\frac{\partial m(\xi, s)}{\partial \xi} - \frac{\partial m(\xi, s')}{\partial \xi}\right] d\xi \ge 0, \\
&\int_{dd(s)}^{dd(s')} \int_{s'}^{s} \frac{\partial^2 m(\xi, y)}{\partial y\,\partial \xi}\,dy\,d\xi \ge 0. \qquad (A.1)
\end{aligned}$$

Differentiating $m(\xi, y)$ with respect to $\xi$ gives

$$\frac{\partial m(\xi, y)}{\partial \xi} = \theta e^{a\xi}(1 - H(\xi)).$$

Because $\theta$ is fixed and $s(\theta, a) = y$, differentiating with respect to $y$ using $a(y)$ gives

$$\frac{\partial^2 m(\xi, y)}{\partial y\,\partial \xi} = \theta a'(y)\,\xi\,e^{a(y)\xi}(1 - H(\xi)) < 0,$$

since $a(\cdot)$ is decreasing in $s$ by Lemma 1. Thus, the inner integration in (A.1) is positive. Hence
(A.1) holds if and only if $dd(s') \ge dd(s)$. □


Proof of Lemma 5: In view of Proposition 4, $H_2(X)$ is identified if and only if the
structure $[F(\cdot,\cdot|X), H(\cdot|X)]$ is. Thus, it suffices to show that the latter is not identified. Let
$[F(\cdot,\cdot|X), H(\cdot|X)]$ be a structure satisfying Definitions 1 and 2 as well as A3 and A4. We
construct a second structure $[\tilde F(\cdot,\cdot|X), \tilde H(\cdot|X)]$ as follows. Let $\tilde\theta = \kappa\theta$ with $\kappa > \sup_{x \in S_X}[1 -
H_2(x)] > 0$, while $\tilde a = a$ so that $\tilde f(\cdot,\cdot|X) = (1/\kappa)f(\cdot/\kappa,\cdot|X)$. Let $\tilde h(\cdot|X)$ be a strictly positive
conditional density on its support $[0, \bar d(X)]$ with $\tilde h(D|X) = (1/\kappa)h(D|X)$ for $D \ge dd_2(X)$.
Because $0 < \int_0^{dd_2(x)} \tilde h(D|x)\,dD < 1$, it follows that $\kappa > 1 - H_2(x)$ for all $x \in S_X$ as required above.
The second structure $[\tilde F(\cdot,\cdot|X), \tilde H(\cdot|X)]$ satisfies Definitions 1 and 2 as well as A3 and A4 as
$\tilde\theta(a,X) = \kappa\theta(a,X)$.

We now show that these two structures are observationally equivalent, i.e., they lead to the
same distribution for the observables $(J^*, D_1^*, \ldots, D_{J^*}^*, \chi, t_1, dd_1, t_2, dd_2)$ given $X$, where $J^*$ and
$D_j^*$ refer to the number of reported accidents and their corresponding damages, respectively,
while $\chi$ indicates which coverage is chosen by the insuree. First, we note that the coverage
terms are deterministic functions of $X$ solving the FOC (14)-(18). Thus, from (25) the optimal
frontier for the second structure must be

$$\tilde\theta(a,X) = \frac{t_2(X) - t_1(X)}{\int_{dd_2(X)}^{dd_1(X)} e^{aD}(1 - \tilde H(D|X))\,dD}
= \frac{t_2(X) - t_1(X)}{\int_{dd_2(X)}^{dd_1(X)} e^{aD}\frac{1}{\kappa}(1 - H(D|X))\,dD}
= \kappa\theta(a,X),$$

thereby showing that the highest risk aversion in $\tilde A_1$ is $\tilde a^*(X) = a^*(X)$.

Regarding the distribution of $\tilde\chi$ given $X$, we note that $\tilde\chi = \chi$. The latter follows from $\tilde\chi = 1$
if and only if $(\tilde\theta, \tilde a) \in \tilde A_1(X)$, i.e., $\tilde\theta < \tilde\theta(\tilde a, X)$ and $\underline a(X) \le \tilde a \le \tilde a^*(X)$. Since $\tilde\theta = \kappa\theta$,
$\tilde\theta(a,X) = \kappa\theta(a,X)$ and $\tilde a^*(X) = a^*(X)$, we have $\tilde\chi = 1$ if and only if $\chi = 1$. Thus, the
distributions of $\tilde\chi$ and $\chi$ given $X$ are the same, i.e., $\tilde\nu_c(X) = \nu_c(X)$ for $c = 1, 2$. Regarding the
distribution of $\tilde J^*$ given $(\tilde\chi, X) = (\chi, X)$, from (22) its moment generating function is

$$\begin{aligned}
M_{\tilde\theta|\chi,X}[(1 - \tilde H_\chi(X))(e^t - 1)|c,x] &= M_{\theta|\chi,X}[(1 - H_\chi(X))(e^t - 1)|c,x] \\
&= M_{J^*|\chi,X}[t|c,x]
\end{aligned}$$

using $1 - \tilde H_c(X) = (1 - H_c(X))/\kappa$, and $M_{\tilde\theta|\chi,X}(u|c,x) = M_{\theta|\chi,X}(\kappa u|c,x)$. Hence, the distribution
of $\tilde J^*$ given $(\tilde\chi, X)$ is the same as that of $J^*$ given $(\chi, X)$. The distribution of reported
damage $\tilde D^*$ given $(\tilde J^*, \tilde\chi, X)$ is

$$\tilde H_\chi^*(\cdot|X) = \frac{\tilde H(\cdot|X) - \tilde H_\chi(X)}{1 - \tilde H_\chi(X)} = \frac{H(\cdot|X) - H_\chi(X)}{1 - H_\chi(X)} = H_\chi^*(\cdot|X)$$

using $1 - \tilde H(\cdot|X) = (1 - H(\cdot|X))/\kappa$.

Lastly, it remains to show that $(t_1(X), dd_1(X), t_2(X), dd_2(X))$ satisfies the FOC (14)-(18)
associated with the second structure. Using $\tilde\theta(a,X) = \kappa\theta(a,X)$, $\tilde f(\tilde\theta(a,X),a|X) = f(\tilde\theta(a,X)/\kappa,a|X)/\kappa
= f(\theta(a,X),a|X)/\kappa$, $1 - \tilde H(D|X) = (1 - H(D|X))/\kappa$, $\tilde\nu_c = \nu_c$ and $E[\tilde\theta|\tilde A_c] = \kappa E[\theta|A_c]$,
it can be easily verified that $(t_1(X), dd_1(X), t_2(X), dd_2(X))$ satisfies (14)-(18) with $\tilde p = p$ as
soon as (14)-(18) hold for the original structure. Hence, the two structures lead to the same
distributions for the observables as desired. □

Additional Condition in Lemma 6:

(v) For $c = 1, 2$ and all $x \in S_X$, $\psi_{J^*|\chi,X}(\cdot|c,x) > 0$ on $\mathbb{N}$ with a moment generating
function defined on $\mathbb{R}$ such that the right-hand sides of (23) are the moment generating functions
of absolutely continuous distributions with densities bounded away from zero on their supports
$[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(1,x)]$ and $[\underline{\tilde\theta}(2,x), \overline{\tilde\theta}(2,x)]$ with union equal to $[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(2,x)]$ included in $\mathbb{R}_{++}$.
Moreover, $S_{a|\tilde\theta w} \equiv \{a : \exists z \in S_{Z|\tilde\theta w}, a = \tilde a(\tilde\theta,w,z)\}$ is a compact interval in $\mathbb{R}_{++}$ independent of $\tilde\theta$.

Condition (v) states that the support of the distribution of reported accidents, given coverage
choice and individual characteristics, is the set of integers. The remaining part of (v) follows
from the compact support of $F(\theta,a|X)$, and its non-vanishing density. The conditions on the
moment generating function of $J^*$ given $(\chi, X)$ can be replaced by conditions on its characteristic
function $\phi_{J^*|\chi,X}(\cdot|c,x)$. Specifically, $\phi_{J^*|\chi,X}(\cdot|c,x)$ is an entire characteristic function such
that the right-hand sides of (23) are characteristic functions corresponding to absolutely
continuous distributions with densities bounded away from zero on their supports $[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(1,x)]$
and $[\underline{\tilde\theta}(2,x), \overline{\tilde\theta}(2,x)]$ with union equal to $[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(2,x)]$ included in $\mathbb{R}_{++}$.$^{18}$

18 Such conditions can be written equivalently in more testable forms. For instance, a function is a
characteristic function if and only if it satisfies Bochner's Theorem 4.2.2, and it is entire if and only if
it satisfies Theorem 7.2.1. A characteristic function corresponds to a distribution with bounded support
in $\mathbb{R}_{++}$ if and only if it satisfies Theorem 7.2.3 with (7.2.3) strictly positive. These theorems and
equations are from Lukacs (1960). A well-known sufficient condition for a distribution to be absolutely
continuous is that its characteristic function is absolutely integrable, while a necessary condition is that
the characteristic function vanishes in the tails. See Billingsley (1995, pp. 345-347).

Proof of Lemma 6: We first prove necessity. Let $[F(\cdot,\cdot|\cdot), H(\cdot|\cdot)] \in \mathcal{F}_X \times \mathcal{H}_X$ be a structure
that rationalizes $\Psi(\cdot,\ldots,\cdot)$ under A3 and A4. To prove (i) we follow the proof of Theorem 4
(Conditions C1-C2) in Guerre, Perrigne and Vuong (2000). From A3-(i,ii), we have $(D_1, \ldots, D_J)$ i.i.d. as
$H(\cdot|X)$ conditional upon $(J, \theta, a, X)$. Thus, $J^*$ follows a $\mathcal{B}[J, 1 - H_\chi(X)]$ given $(J, \theta, a, X)$ since an
accident is reported if and only if the damage is above the deductible. For any $(d_1, \ldots, d_j) \in \mathbb{R}^j$,


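$$\begin{aligned}
&\Pr[D_1^* \le d_1, \ldots, D_j^* \le d_j, J^* = j \,|\, J, \theta, a, X] \\
&\quad = \sum_{r_1 < \cdots < r_j} \Pr[dd_\chi(X) \le D_{r_1} \le d_1, \ldots, dd_\chi(X) \le D_{r_j} \le d_j, D_r < dd_\chi(X), r \notin \{r_1, \ldots, r_j\} \,|\, J, \theta, a, X] \\
&\quad = \frac{J!}{j!(J-j)!}\Pr[dd_\chi(X) \le D_1 \le d_1, \ldots, dd_\chi(X) \le D_j \le d_j, D_r < dd_\chi(X), r = j+1, \ldots, J \,|\, J, \theta, a, X] \\
&\quad = \frac{J!}{j!(J-j)!}\left\{\prod_{r=1}^{j}[H(d_r|X) - H_\chi(X)]\right\}[H_\chi(X)]^{J-j}
\end{aligned}$$

because $(D_1, \ldots, D_J)$ are i.i.d. as $H(\cdot|X)$ given $(J, \theta, a, X)$. Since $J^*$ is $\mathcal{B}[J, 1 - H_\chi(X)]$ given
$(J, \theta, a, X)$ we obtain

$$\Pr[D_1^* \le d_1, \ldots, D_j^* \le d_j \,|\, J^* = j, J, \theta, a, X] = \prod_{r=1}^{j} \frac{H(d_r|X) - H_\chi(X)}{1 - H_\chi(X)},$$

showing that $(D_1^*, \ldots, D_j^*)$ are i.i.d. as $H_\chi^*(\cdot|X) \in \mathcal{H}^*_{\chi X}$ given $(J^* = j, J, \theta, a, X)$, and hence given
$(J^* = j, \chi, X)$. Thus, (i) holds.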

To prove (ii), we note that $\Psi_{D^*|\chi,X}(\cdot|\chi,\cdot) = H_\chi^*(\cdot|\cdot) \in \mathcal{H}^*_{\chi X}$ thereby establishing the first part
of (ii). Moreover, $\psi_{D^*|\chi,X}(d|2,x)/\psi_{D^*|\chi,X}(d|1,x) = (1 - H_1(x))/(1 - H_2(x)) = \lambda(x)$, which is
independent of $d \in [dd_1(x), \bar d(x)]$ and in $(0,1)$. Regarding (iii), for every $(\theta,a,w) \in S_{\theta a W}$,

$$\begin{aligned}
F_{a|\theta,W}(a|\theta,w) &= F_{a|\theta,W,Z}[a(\theta,w,z)|\theta,w,z] \\
&= \frac{f_{\theta|\chi,W,Z}(\theta|1,w,z)\,\psi_{\chi|W,Z}(1|w,z)}{f_{\theta|W,Z}(\theta|w,z)} \\
&= \frac{f_{\tilde\theta|\chi,W,Z}(\tilde\theta|1,w,z)\,\psi_{\chi|W,Z}(1|w,z)}{f_{\tilde\theta|W,Z}(\tilde\theta|w,z)}
\end{aligned}$$

for some $z \in S_{Z|\theta w}$, and where the first equality follows from A4, the second equality from
Bayes' rule, and the third equality from $\tilde\theta = (1 - H_2(X))\theta$. Because $a$ can be chosen arbitrarily,
it follows that the right-hand side takes all values in $[0,1]$. Regarding (iv), let $\tilde\theta = (1 - H_2(X))\theta$.
The proof then follows the last paragraph of the proof of Lemma 5 with $\kappa = 1 - H_2(X)$.

To prove (v), we note that

$$\Pr[J^* = j^*|\theta, a, X] = \sum_{j = j^*}^{\infty} \Pr[J^* = j^*|J = j, \theta, a, X]\Pr[J = j|\theta, a, X].$$

Thus, $J^*$ given $(\theta,a,X)$ is a mixture of a $\mathcal{B}[J, 1 - H_\chi(X)]$ with a mixing $\mathcal{P}(\theta)$ distribution by
A3-(iii). That is, $\psi_{J^*|\theta,a,X}(\cdot|\theta,a,x)$ is a $\mathcal{P}[(1 - H_\chi(x))\theta]$ distribution. Hence, $\psi_{J^*|\chi,X}(\cdot|c,x) =
\int_{A_c(x)} \psi_{J^*|\theta,a,X}(\cdot|\theta,a,x)\,dF(\theta,a|x)$ thereby establishing $\psi_{J^*|\chi,X}(\cdot|c,x) > 0$ on $\mathbb{N}$ as $F(\cdot,\cdot|\cdot) \in \mathcal{F}_X$.
The moment generating function of $J^*$ given $(\chi, X)$ exists on $\mathbb{R}$ in view of (22) since the
distribution of $\theta$ given $(\chi, X)$ has a bounded support. The right-hand sides of (23) must be the moment
generating functions of absolutely continuous distributions, with densities bounded away from
zero on their supports $[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(1,x)]$ and $[\underline{\tilde\theta}(2,x), \overline{\tilde\theta}(2,x)]$ with union equal to $[\underline{\tilde\theta}(1,x), \overline{\tilde\theta}(2,x)]$
included in $\mathbb{R}_{++}$, because they are the moment generating functions of $\tilde\theta = (1 - H_2(X))\theta$ given
$(c, x)$, which have such properties.

We now turn to sufficiency. Let the distribution T(-,..., •) of (J*, D *,..., Dj*,x, X) and 
the contract terms [ti(-), dd\(-), hi'), dd 2 {-)\ satisfy (i)-(v). We need to exhibit a structure 
[F(-, •)•), H(■]■)] G F x xT-L x satisfying A3 and A4 that rationalizes T(-, ...,•) of (J*, D *,..., Dj „, 
X,X) and [t 1 (-),ddi(-),t 2 (-),dd 2 (-)]. 

In view of the identification argument of Section 4.2, we define $H(\cdot|\cdot)$ as follows: For a
constant $\kappa \in (0,1)$, let $H(D|X) = \kappa \Psi_{D^*|\chi,X}(D|2,X) + (1 - \kappa)$ when $D \geq dd_2(X)$. Note
that $H(\cdot|X)$ has a strictly positive density on $[dd_2(X), \overline{d}(X)]$ because $\Psi_{D^*|\chi,X}(\cdot|2,X) \in \mathcal{H}_{2X}$.
For $D \in [0, dd_2(X)]$, let $H(\cdot|X)$ be arbitrary as long as it has a strictly positive density on
$[0, dd_2(X)]$. Thus, $H(\cdot|\cdot) \in \mathcal{H}_X$. Note that $\kappa = 1 - H(dd_2(X)|X) = 1 - H_2(X)$ so that
$H_2^*(\cdot|X) = [H(\cdot|X) - H_2(X)]/[1 - H_2(X)] = \Psi_{D^*|\chi,X}(\cdot|2,X)$ after straightforward algebra.
Moreover, $\psi_{D^*|\chi,X}(D|2,X) = \lambda(X)\,\psi_{D^*|\chi,X}(D|1,X)$ for $D \geq dd_1(X)$ by (ii), implying $\lambda(X) =
1 - \Psi_{D^*|\chi,X}[dd_1(X)|2,X]$ by integration, and $H_1^*(\cdot|X) = [H(\cdot|X) - H_1(X)]/[1 - H_1(X)] =
\Psi_{D^*|\chi,X}(\cdot|1,X)$ after some algebra. Thus, $\Psi_{D_1^*,\ldots,D_{J^*}^*|J^*,\chi,X}(\cdot,\ldots,\cdot|\cdot,\cdot,\cdot)$ is rationalized given A3
as long as $\chi$ is a function of $(\theta, a, X)$ as implied by the theoretical model.
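
The algebra showing that the truncated distribution $[H(\cdot|X) - H_1(X)]/[1 - H_1(X)]$ coincides with the observed distribution of reported damages under coverage 1 can be spelled out as follows. This is only a sketch: $\Psi_c(\cdot)$ and $\psi_c(\cdot)$ are shorthand introduced here for $\Psi_{D^*|\chi,X}(\cdot|c,X)$ and its density, the argument $X$ is suppressed, and it is assumed that $dd_1 \geq dd_2$ and that reported damages under coverage 1 are supported on $[dd_1, \overline{d}]$, as the integration step in the text suggests.

```latex
% Shorthand (introduced here, not in the text): \Psi_c(D) = \Psi_{D^*|\chi,X}(D|c,X), X suppressed.
% Assumptions: dd_1 >= dd_2, and \Psi_1(dd_1) = 0.
\begin{align*}
H_1 &= H(dd_1) = \kappa\,\Psi_2(dd_1) + 1 - \kappa = 1 - \kappa\lambda
      && \text{since } \lambda = 1 - \Psi_2(dd_1), \\
\frac{H(D) - H_1}{1 - H_1}
    &= \frac{\kappa\,[\Psi_2(D) - \Psi_2(dd_1)]}{\kappa\lambda}
     = \frac{1}{\lambda}\int_{dd_1}^{D} \psi_2(u)\,du \\
    &= \int_{dd_1}^{D} \psi_1(u)\,du = \Psi_1(D), \qquad D \geq dd_1 ,
\end{align*}
% The last line uses \psi_2 = \lambda \psi_1 on [dd_1, \bar d] from condition (ii).
```

The constant $\kappa$ cancels, which is why any $\kappa \in (0,1)$ yields a damage distribution consistent with the observed reported damages.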

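To construct $F(\cdot,\cdot|\cdot)$ we follow the identification argument. Let $f(\theta|c,X) = \kappa f_{\tilde\theta|\chi,X}(\kappa\theta|c,X)$
and $f(\theta|X) = \kappa f_{\tilde\theta|X}(\kappa\theta|X)$, where these densities exist by condition (v). In particular, $f(\theta|X)$
is strictly positive on its support $[\underline{\tilde\theta}(1,X)/\kappa, \overline{\tilde\theta}(2,X)/\kappa] \subset \mathbb{R}_{++}$. Turning to $F_{a|\theta,W,Z}(\cdot|\cdot,\cdot,\cdot) =
F_{a|\theta,W}(\cdot|\cdot,\cdot)$ by A4-(i), we follow (26). For every $(\theta,w) \in \mathcal{S}_{\theta W}$, let $F_{a|\theta,W}(\cdot|\theta,w)$ have a strictly
positive density on its support $\mathcal{S}_{a|\tilde\theta W} = \{a : \exists z \in \mathcal{S}_{Z|\tilde\theta W}, a = \tilde a(\tilde\theta,w,z)\} = \mathcal{S}_{a|\theta W} = \{a : \exists z \in
\mathcal{S}_{Z|\theta W}, a = a(\theta,w,z)\}$ satisfying

\[
F_{a|\theta,W}[a(\theta,w,z)|\theta,w]
= \frac{f_{\tilde\theta|\chi,W,Z}(\tilde\theta|1,w,z)\,\psi_{\chi|W,Z}(1|w,z)}{f_{\tilde\theta|W,Z}(\tilde\theta|w,z)}
\qquad \text{(A.2)}
\]

for every $(\theta,w,z) \in \mathcal{S}_{\theta W Z}$, where $\tilde\theta = \kappa\theta$ and $a(\theta,w,z) = \tilde a(\kappa\theta,w,z)$. By (iii) the right-hand
side has the range of $[0,1]$ as $z$ varies in $\mathcal{S}_{Z|\tilde\theta W}$ for every given $(\tilde\theta,w) \in \mathcal{S}_{\tilde\theta W}$, i.e., for every given
$(\theta,w) \in \mathcal{S}_{\theta W}$. Thus, for every $(\theta,w) \in \mathcal{S}_{\theta W}$ and every $a \in \mathcal{S}_{a|\theta W}$ there exists a $z \in \mathcal{S}_Z$ such
that $a = a(\theta,w,z)$, i.e., A4-(ii) is satisfied. We can now extend $F_{a|\theta,W}(\cdot|\theta,w)$ over $\mathcal{S}_{a|\theta W}$ by
$F_{a|\theta,W}(a|\theta,w) = F_{a|\theta,W}[a(\theta,w,z)|\theta,w]$ using (A.2). Thus, $F(\cdot,\cdot|\cdot) \in \mathcal{F}_X$ as desired.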
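
The densities of $\theta$ in this construction are obtained from those of the rescaled risk by a change of variables: the density of $\theta$ at a point is $\kappa$ times the density of the rescaled risk evaluated at $\kappa\theta$. The sketch below illustrates this on a toy case and checks that the resulting function integrates to one on the rescaled support. The uniform density and the numerical values of `KAPPA`, `TILDE_LOWER` and `TILDE_UPPER` are hypothetical and are not taken from the paper.

```python
def uniform_density(value, lower, upper):
    """Density of a uniform distribution on [lower, upper]."""
    return 1.0 / (upper - lower) if lower <= value <= upper else 0.0


def rescaled_density(theta, kappa, tilde_density):
    """Density of theta = tilde_theta / kappa, i.e. kappa * f_tilde(kappa * theta)."""
    return kappa * tilde_density(kappa * theta)


def integrate(func, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


# Hypothetical values, chosen only for illustration.
KAPPA = 0.4                          # stands for 1 - H_2(X)
TILDE_LOWER, TILDE_UPPER = 0.2, 1.0  # support of the rescaled risk


def theta_density(theta):
    """Implied density of theta when the rescaled risk is uniform (toy assumption)."""
    return rescaled_density(
        theta, KAPPA, lambda t: uniform_density(t, TILDE_LOWER, TILDE_UPPER)
    )


# The support of theta is the support of the rescaled risk divided by kappa.
total_mass = integrate(theta_density, TILDE_LOWER / KAPPA, TILDE_UPPER / KAPPA)
```

In this toy case `total_mass` is numerically one and `theta_density` vanishes outside the rescaled support, in line with the support stated for $f(\theta|X)$.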

The structure $[F(\cdot,\cdot|\cdot), H(\cdot|\cdot)]$ constructed as above rationalizes $\psi_{J^*|\chi,X}(\cdot|\cdot,\cdot)$ because of
(23) and the uniqueness of the corresponding density. This structure also rationalizes $\psi_{\chi|X}(\cdot|\cdot)$.
Specifically, by definition we have

\[
F_{a|\theta,W}(a(\theta,w,z)|\theta,w)
= \frac{f_{\theta|\chi,W,Z}(\theta|1,w,z)\,\nu_1(w,z)}{f_{\theta|W,Z}(\theta|w,z)}
= \frac{f_{\tilde\theta|\chi,W,Z}(\tilde\theta|1,w,z)\,\nu_1(w,z)}{f_{\tilde\theta|W,Z}(\tilde\theta|w,z)}.
\]

Using (A.2) shows that $\nu_1(w,z) = \psi_{\chi|W,Z}(1|w,z)$ as desired. The fact that the structure
rationalizes $(t_1(\cdot), dd_1(\cdot), t_2(\cdot), dd_2(\cdot))$ follows the argument of the last paragraph of the proof of
Lemma 5. □